This chapter deals with convolutional coding. Chapter 6 presented the
fundamentals of linear block codes, which are described by two integers, n and
k, and a generator matrix or polynomial. The integer k is the number of data
bits that form an input to a block encoder. The integer n is the total number
of bits in the associated codeword out of the encoder. A characteristic of
linear block codes is that each codeword n-tuple is uniquely determined by the
input message k-tuple. The ratio k/n is called the rate of the code, a measure
of the amount of added redundancy. A convolutional code is described by three
integers, n, k, and K, where the ratio k/n has the same code rate significance
(information per coded bit) that it has for block codes; however, n does not
define a block or codeword length as it does for block codes. The integer K is
a parameter known as the constraint length; it represents the number of
k-tuple stages in the encoding shift register. An important characteristic of
convolutional codes, different from block codes, is that the encoder has
memory: the n-tuple emitted by the convolutional encoding procedure is not
only a function of an input k-tuple, but is also a function of the previous
K - 1 input k-tuples. In practice, n and k are small integers and K is varied
to control the capability and complexity of the code.


7.1 CONVOLUTIONAL ENCODING 

In Figure 1.2 we presented a typical block diagram of a digital communication
system. A version of this functional diagram, focusing primarily on the
convolutional encode/decode and modulate/demodulate portions of the
communication link, is shown in Figure 7.1. The input message source is
denoted by the sequence m = m_1, m_2, ..., m_i, ..., where each m_i represents
a binary digit (bit), and i is a time index. To be precise, one should denote
the elements of m with an index for class membership (e.g., for binary codes,
1 or 0) and an index for time. However, in this chapter, for simplicity,
indexing is only used to indicate time (or location within a sequence). We
shall assume that each m_i is equally likely to be a one or a zero, and
independent from digit to digit. Being independent, the bit sequence lacks any
redundancy; that is, knowledge about bit m_i gives no information about m_j
(i ≠ j). The encoder transforms each sequence m into a unique codeword
sequence U = G(m). Even though the sequence m uniquely defines the sequence U,
a key feature of convolutional codes is that a given k-tuple within m does not
uniquely define its associated n-tuple within U, since the encoding of each
k-tuple is not only a function of that k-tuple but is also a function of the
K - 1 input k-tuples that precede it. The sequence U can be partitioned into a
sequence of branch words: U = U_1, U_2, ..., U_i, .... Each branch word U_i is
made up of binary code symbols, often called channel symbols, channel bits, or
code bits; unlike the input message bits, the code symbols are not
independent.

In a typical communication application, the codeword sequence U modulates a
waveform s(t). During transmission, the waveform s(t) is corrupted by noise,
resulting in a received waveform ŝ(t) and a demodulated sequence
Z = Z_1, Z_2, ..., Z_i, ..., as indicated in Figure 7.1. The task of the
decoder is to produce an estimate m̂ = m̂_1, m̂_2, ..., m̂_i, ... of the
original message sequence, using the received sequence Z together with a
priori knowledge of the encoding procedure.

A general convolutional encoder, shown in Figure 7.2, is mechanized with a
kK-stage shift register and n modulo-2 adders, where K is the constraint
length.



Figure 7.1 Encode/decode and modulate/demodulate portions of a
communication link. In the figure, m = m_1, m_2, ..., m_i, ... is the input
sequence and Z = Z_1, Z_2, ..., Z_i, ... is the demodulated sequence, where
Z_i = z_1i, ..., z_ji, ..., z_ni and z_ji is the jth demodulator output symbol
of branch word Z_i.

Figure 7.2 Convolutional encoder with constraint length K and rate k/n.
In the figure, the input sequence m = m_1, m_2, ..., m_i, ... is shifted k
bits at a time into a kK-stage shift register (stages 1, 2, 3, ..., kK) that
feeds n modulo-2 adders. The codeword sequence is U = U_1, U_2, ..., U_i, ...,
where U_i = u_1i, ..., u_ji, ..., u_ni is the ith codeword branch and u_ji is
the jth binary code symbol of branch word U_i.

The constraint length represents the number of k-bit shifts over which a
single information bit can influence the encoder output. At each unit of time,
k bits are shifted into the first k stages of the register; all bits in the
register are shifted k stages to the right, and the outputs of the n adders
are sequentially sampled to yield the binary code symbols or code bits. These
code symbols are then used by the modulator to specify the waveforms to be
transmitted over the channel. Since there are n code bits for each input group
of k message bits, the code rate is k/n message bit per code bit, where k < n.

We shall consider only the most commonly used binary convolutional encoders
for which k = 1, that is, those encoders in which the message bits are shifted
into the encoder one bit at a time, although generalization to higher order
alphabets is straightforward [1, 2]. For the k = 1 encoder, at the ith unit of
time, message bit m_i is shifted into the first shift register stage; all
previous bits in the register are shifted one stage to the right, and as in
the more general case, the outputs of the n adders are sequentially sampled
and transmitted. Since there are n code bits for each message bit, the code
rate is 1/n. The n code symbols occurring at time t_i comprise the ith branch
word, U_i = u_1i, u_2i, ..., u_ni, where u_ji (j = 1, 2, ..., n) is the jth
code symbol belonging to the ith branch word. Note that for the rate 1/n
encoder, the kK-stage shift register can be referred to simply as a K-stage
register, and the constraint length K, which was expressed in units of k-tuple
stages, can be referred to as constraint length in units of bits.


7.2 CONVOLUTIONAL ENCODER REPRESENTATION 

To describe a convolutional code, one needs to characterize the encoding
function G(m), so that given an input sequence m, one can readily compute the
output sequence U. Several methods are used for representing a convolutional
encoder, the most popular being the connection pictorial, connection vectors
or polynomials, the state diagram, the tree diagram, and the trellis diagram.
They are each described below.

7.2.1 Connection Representation 

We shall use the convolutional encoder, shown in Figure 7.3, as a model for
discussing convolutional encoders. The figure illustrates a (2, 1)
convolutional encoder with constraint length K = 3. There are n = 2 modulo-2
adders; thus the code rate k/n is 1/2. At each input bit time, a bit is
shifted into the leftmost stage and the bits in the register are shifted one
position to the right. Next, the output switch samples the output of each
modulo-2 adder (i.e., first the upper adder, then the lower adder), thus
forming the code symbol pair making up the branch word associated with the bit
just inputted. The sampling is repeated for each inputted bit. The choice of
connections between the adders and the stages of the register gives rise to
the characteristics of the code. Any change in the choice of connections
results in a different code. The connections are, of course, not chosen or
changed arbitrarily. The problem of choosing connections to yield good
distance properties is complicated and has not been solved in general;
however, good codes have been found by computer search for all constraint
lengths less than about 20 [3-5].

Unlike a block code that has a fixed word length n, a convolutional code has
no particular block size. However, convolutional codes are often forced into a
block structure by periodic truncation. This requires a number of zero bits to
be appended to the end of the input data sequence, for the purpose of clearing
or flushing the encoding shift register of the data bits. Since the added
zeros carry no information, the effective code rate falls below k/n. To keep
the code rate close to k/n, the truncation period is generally made as long as
practical.

Figure 7.3 Convolutional encoder (rate 1/2, K = 3).

One way to represent the encoder is to specify a set of n connection vectors,
one for each of the n modulo-2 adders. Each vector has dimension K and
describes the connection of the encoding shift register to that modulo-2
adder. A one in the ith position of the vector indicates that the
corresponding stage in the shift register is connected to the modulo-2 adder,
and a zero in a given position indicates that no connection exists between the
stage and the modulo-2 adder. For the encoder example in Figure 7.3, we can
write the connection vector g_1 for the upper connections and g_2 for the
lower connections as follows:

    g_1 = 1 1 1
    g_2 = 1 0 1

Now consider that a message vector m = 1 0 1 is convolutionally encoded with
the encoder shown in Figure 7.3. The three message bits are inputted, one at a
time, at times t_1, t_2, and t_3, as shown in Figure 7.4. Subsequently,
(K - 1) = 2 zeros are inputted at times t_4 and t_5 to flush the register and
thus ensure that the tail end of the message is shifted the full length of the
register. The output sequence is seen to be 1 1 1 0 0 0 1 0 1 1, where the
leftmost symbol represents the earliest transmission. The entire output
sequence, including the code symbols that result from flushing, is needed to
decode the message. To flush the message from the encoder requires one less
zero than the number of stages in the register, or K - 1 flush bits. Another
zero input is shown at time t_6 for the reader to verify that the flushing is
completed at time t_5. Thus, a new message can be entered at time t_6.
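The shift-and-sample procedure just described can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the function
name conv_encode and its argument layout are our own, the register is assumed
to start in the all-zeros condition, and the new bit is taken to occupy the
leftmost stage, as in Figure 7.3.

```python
def conv_encode(message, generators, flush=True):
    """Rate 1/n convolutional encoder (k = 1) described by connection vectors.

    message    -- list of input bits, earliest bit first
    generators -- list of n connection vectors, each of length K
    flush      -- append K - 1 zeros so the last bit traverses the register
    """
    K = len(generators[0])
    register = [0] * K                       # assumed all-zeros initial contents
    bits = list(message) + ([0] * (K - 1) if flush else [])
    output = []
    for bit in bits:
        register = [bit] + register[:-1]     # shift right; new bit enters on the left
        for g in generators:                 # sample the adders in order
            output.append(sum(r & c for r, c in zip(register, g)) % 2)
    return output


g1 = [1, 1, 1]   # upper connections
g2 = [1, 0, 1]   # lower connections
print(conv_encode([1, 0, 1], [g1, g2]))
# [1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
```

The printed list is the sequence 11 10 00 10 11 quoted in the text, with the
last two branch words produced by the K - 1 = 2 flush bits.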

7.2.1.1 Impulse Response of the Encoder

We can approach the encoder in terms of its impulse response, that is, the
response of the encoder to a single "one" bit that moves through it. Consider
the contents of the register in Figure 7.3 as a one moves through it:

    Register      Branch word
    contents      u_1    u_2
      100          1      1
      010          1      0
      001          1      1

    Input sequence:     1    0    0
    Output sequence:    11   10   11

The output sequence for the input "one" is called the impulse response of the
encoder. Then, for the input sequence m = 1 0 1, the output may be found by
the superposition or the linear addition of the time-shifted input "impulses"
as follows:

    Input m           Output
       1              11   10   11
       0                   00   00   00
       1                        11   10   11
    Modulo-2 sum:     11   10   00   10   11

Observe that this is the same output as that obtained in Figure 7.4,
demonstrating that convolutional codes are linear, just like the linear block
codes of Chapter 6. It is from this property of generating the output by the
linear addition of time-shifted impulses, or the convolution of the input
sequence with the impulse response of the encoder, that we derive the name
convolutional encoder. Often, this encoder characterization is presented in
terms of an infinite-order generator matrix [6].

Figure 7.4 Convolutionally encoding a message sequence with a rate 1/2,
K = 3 encoder. For m = 1 0 1 followed by zeros, the figure lists the outputs
u_1 u_2 at times t_1 through t_6 as 1 1, 1 0, 0 0, 1 0, 1 1, and 0 0.
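The superposition argument can be checked numerically. The sketch below adds,
modulo 2, one delayed copy of the impulse response for every one in the
message; the helper name shift_and_add is hypothetical, and the impulse
response 11 10 11 is the one tabulated for the encoder of Figure 7.3.

```python
def shift_and_add(message, impulse_response, n=2):
    """Modulo-2 sum of time-shifted copies of the encoder impulse response."""
    K = len(impulse_response) // n              # branch words in the impulse response
    total = [0] * (n * (len(message) + K - 1))  # room for the flushed tail
    for i, bit in enumerate(message):
        if bit:                                 # a zero input contributes all zeros
            for j, symbol in enumerate(impulse_response):
                total[n * i + j] ^= symbol      # delay by i branch words, add modulo 2
    return total


impulse = [1, 1, 1, 0, 1, 1]                    # 11 10 11
print(shift_and_add([1, 0, 1], impulse))
# [1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
```

The result agrees with the modulo-2 sum 11 10 00 10 11, which is the linearity
property in action: the response to a sum of inputs is the sum of the
responses.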

Notice that the effective code rate for the foregoing example with 3-bit
input sequence and 10-bit output sequence is k/n = 3/10, quite a bit less than
the rate 1/2 that might have been expected from the knowledge that each input
data bit yields a pair of output channel bits. The reason for the disparity is
that the final data bit into the encoder needs to be shifted through the
encoder. All of the output channel bits are needed in the decoding process. If
the message had been longer, say 300 bits, the output codeword sequence would
contain 604 bits, resulting in a code rate of 300/604, much closer to 1/2.
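The two rates quoted above follow from counting code bits: a message of L bits
plus K - 1 flush bits produces n(L + K - 1) channel bits. A small sketch of
that arithmetic, with a function name of our own choosing, is:

```python
from fractions import Fraction

def effective_rate(message_bits, K, n):
    """Effective rate of a flushed rate 1/n code: L data bits over n(L + K - 1) code bits."""
    return Fraction(message_bits, n * (message_bits + K - 1))


print(effective_rate(3, K=3, n=2))      # 3/10
print(effective_rate(300, K=3, n=2))    # 75/151, which is 300/604 in lowest terms
```

As the message grows, the K - 1 flush bits matter less and the effective rate
approaches 1/n.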

7.2.1.2 Polynomial Representation

Sometimes, the encoder connections are characterized by generator
polynomials, similar to those used in Chapter 6 for describing the feedback
shift register implementation of cyclic codes. We can represent a
convolutional encoder with a set of n generator polynomials, one for each of
the n modulo-2 adders. Each polynomial is of degree K - 1 or less and
describes the connection of the encoding shift register to that modulo-2
adder, much the same way that a connection vector does. The coefficient of
each term in the (K - 1)-degree polynomial is either 1 or 0, depending on
whether a connection exists or does not exist between the shift register and
the modulo-2 adder in question. For the encoder example in Figure 7.3, we can
write the generator polynomial g_1(X) for the upper connections and g_2(X) for
the lower connections as follows:

$$g_1(X) = 1 + X + X^2$$
$$g_2(X) = 1 + X^2$$

where the lowest order term in the polynomial corresponds to the input stage
of the register. The output sequence is found as follows:

$$U(X) = m(X)g_1(X) \text{ interlaced with } m(X)g_2(X)$$

First, express the message vector m = 1 0 1 as a polynomial, that is,
m(X) = 1 + X^2. We shall again assume the use of zeros following the message
bits, to flush the register. Then the output polynomial U(X), or the output
sequence U, of the Figure 7.3 encoder can be found for the input message m as
follows:

$$m(X)g_1(X) = (1 + X^2)(1 + X + X^2) = 1 + X + X^3 + X^4$$
$$m(X)g_2(X) = (1 + X^2)(1 + X^2) = 1 + X^4$$

$$m(X)g_1(X) = 1 + X + 0X^2 + X^3 + X^4$$
$$m(X)g_2(X) = 1 + 0X + 0X^2 + 0X^3 + X^4$$

$$U(X) = (1,1) + (1,0)X + (0,0)X^2 + (1,0)X^3 + (1,1)X^4$$

    U = 1 1   1 0   0 0   1 0   1 1

In this example we started with another point of view, namely, that the
convolutional encoder can be treated as a set of cyclic code shift registers.
We represented the encoder with polynomial generators as used for describing
cyclic codes. However, we arrived at the same output sequence as in Figure 7.4
and at the same output sequence as the impulse response treatment of the
preceding section. (For a good presentation of convolutional code structure in
the context of linear sequential circuits, see Reference [7].)
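The polynomial products can be reproduced with modulo-2 coefficient
arithmetic. In the sketch below, polynomials are stored as coefficient lists
with the lowest order term first; that storage convention and the helper name
gf2_poly_mul are assumptions made for illustration only.

```python
def gf2_poly_mul(a, b):
    """Multiply two polynomials over GF(2); coefficient lists, lowest order first."""
    product = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            product[i + j] ^= ai & bj        # coefficients add modulo 2
    return product


m = [1, 0, 1]       # m(X)   = 1 + X^2
g1 = [1, 1, 1]      # g_1(X) = 1 + X + X^2
g2 = [1, 0, 1]      # g_2(X) = 1 + X^2

upper = gf2_poly_mul(m, g1)     # [1, 1, 0, 1, 1], i.e. 1 + X + X^3 + X^4
lower = gf2_poly_mul(m, g2)     # [1, 0, 0, 0, 1], i.e. 1 + X^4

# Interlace the two coefficient streams to form the branch words of U.
U = [bit for pair in zip(upper, lower) for bit in pair]
print(U)
# [1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
```

The X^2 terms cancel because 1 + 1 = 0 modulo 2, which is why the third branch
word is 00.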


7.2.2 State Representation and the State Diagram 

A convolutional encoder belongs to a class of devices known as finite-state
machines, which is the general name given to machines that have a memory of
past signals. The adjective finite refers to the fact that there are only a
finite number of unique states that the machine can encounter. What is meant
by the state of a finite-state machine? In the most general sense, the state
consists of the smallest amount of information that, together with a current
input to the machine, can predict the output of the machine. The state
provides some knowledge of the past signaling events and the restricted set of
possible outputs in the future. A future state is restricted by the past
state. For a rate 1/n convolutional encoder, the state is represented by the
contents of the rightmost K - 1 stages (see Figure 7.3). Knowledge of the
state together with knowledge of the next input is necessary and sufficient to
determine the next output. Let the state of the encoder at time t_i be defined
as X_i = m_{i-1}, m_{i-2}, ..., m_{i-K+1}. The ith codeword branch U_i is
completely determined by state X_i and the present input bit m_i; thus the
state X_i represents the past history of the encoder in determining the
encoder output. The encoder state is said to be Markov, in the sense that the
probability P(X_{i+1} | X_i, X_{i-1}, ..., X_0) of being in state X_{i+1},
given all previous states, depends only on the most recent state X_i; that is,
the probability is equal to P(X_{i+1} | X_i).

One way to represent simple encoders is with a state diagram; such a
representation for the encoder in Figure 7.3 is shown in Figure 7.5. The
states, shown in the boxes of the diagram, represent the possible contents of
the rightmost K - 1 stages of the register, and the paths between the states
represent the output branch words resulting from such state transitions. The
states of the register are designated a = 00, b = 10, c = 01, and d = 11; the
diagram shown in Figure 7.5 illustrates all the state transitions that are
possible for the encoder in Figure 7.3. There are only two transitions
emanating from each state, corresponding to the two possible input bits. Next
to each path between states is written the output branch word associated with
the state transition. In drawing the path, we use the convention that a solid
line denotes a path associated with an input bit, zero, and a dashed line
denotes a path associated with an input bit, one. Notice that it is not
possible in a single transition to move from a given state to any arbitrary
state. As a consequence of shifting in one bit at a time, there are only two
possible state transitions that the register can make at each bit time. For
example, if the present encoder state is 00, the only possibilities for the
state at the next shift are 00 or 10.

Figure 7.5 Encoder state diagram (rate 1/2, K = 3). Legend: solid line,
input bit 0; dashed line, input bit 1.
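The state diagram is equivalent to a table of transitions, which can be
generated directly from the connection vectors. The following is a minimal
sketch rather than a definitive implementation; the function name
next_state_and_output and the tuple representation of the state are our own
choices.

```python
def next_state_and_output(state, bit, generators):
    """One transition of a rate 1/n encoder.

    state -- tuple holding the rightmost K - 1 register stages
    bit   -- present input bit (it occupies the leftmost stage)
    """
    register = (bit,) + state
    branch_word = tuple(sum(r & c for r, c in zip(register, g)) % 2
                        for g in generators)
    return register[:-1], branch_word        # leftmost K - 1 stages form the next state


generators = [(1, 1, 1), (1, 0, 1)]
labels = {(0, 0): "a", (1, 0): "b", (0, 1): "c", (1, 1): "d"}

transitions = {}
for state in labels:
    for bit in (0, 1):
        transitions[(state, bit)] = next_state_and_output(state, bit, generators)
        nxt, word = transitions[(state, bit)]
        print(f"{labels[state]} --input {bit} / output {word[0]}{word[1]}--> {labels[nxt]}")
```

Each state appears with exactly two outgoing lines, one per input bit, and
from state a = 00 the only reachable states are a = 00 and b = 10, as noted
in the text.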


Example 7.1 Convolutional Encoding

For the encoder shown in Figure 7.3, show the state changes and the resulting
output codeword sequence U for the message sequence m = 1 1 0 1 1, followed by
K - 1 = 2 zeros to flush the register. Assume that the initial contents of the
register are all zeros.

Solution

    Input     Register    State at     State at        Branch word
    bit m_i   contents    time t_i     time t_{i+1}    at time t_i
                                                       u_1    u_2
      -         000          00           00            -      -
      1         100          00           10            1      1
      1         110          10           11            0      1
      0         011          11           01            0      1
      1         101          01           10            0      0
      1         110          10           11            0      1
      0         011          11           01            0      1
      0         001          01           00            1      1

    Output sequence: U = 11 01 01 00 01 01 11

Example 7.2 Convolutional Encoding

In Example 7.1 the initial contents of the register are all zeros. This is
equivalent to the condition that the given input sequence is preceded by two
zero bits (the encoding is a function of the present bit and the K - 1 prior
bits). Repeat Example 7.1 with the assumption that the given input sequence is
preceded by two one bits, and verify that now the codeword sequence U for
input sequence m = 1 1 0 1 1 is different than the codeword found in
Example 7.1.

Solution

The entry "x" signifies "don't know."

    Input     Register    State at     State at        Branch word
    bit m_i   contents    time t_i     time t_{i+1}    at time t_i
                                                       u_1    u_2
      -         11x          1x           11            -      -
      1         111          11           11            1      0
      1         111          11           11            1      0
      0         011          11           01            0      1
      1         101          01           10            0      0
      1         110          10           11            0      1
      0         011          11           01            0      1
      0         001          01           00            1      1

    Output sequence: U = 10 10 01 00 01 01 11

By comparing this result with that of Example 7.1, we can see that each branch
word of the output sequence U is not only a function of the input bit, but is
also a function of the K - 1 prior bits.
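Both examples can be reproduced with one routine that accepts the K - 1 bits
that preceded the message. The sketch below is illustrative only; the name
encode_with_history and the prior_bits argument are hypothetical, and the
generators are those of the Figure 7.3 encoder.

```python
def encode_with_history(message, generators, prior_bits):
    """Encode a rate 1/n message when the register already holds earlier bits.

    prior_bits -- the K - 1 bits that preceded the message, most recent first
    """
    K = len(generators[0])
    state = list(prior_bits)
    branch_words = []
    for bit in list(message) + [0] * (K - 1):    # K - 1 zeros flush the register
        register = [bit] + state
        branch_words.append("".join(
            str(sum(r & c for r, c in zip(register, g)) % 2) for g in generators))
        state = register[:-1]
    return " ".join(branch_words)


generators = [[1, 1, 1], [1, 0, 1]]
m = [1, 1, 0, 1, 1]
print(encode_with_history(m, generators, prior_bits=[0, 0]))  # 11 01 01 00 01 01 11
print(encode_with_history(m, generators, prior_bits=[1, 1]))  # 10 10 01 00 01 01 11
```

Only the first two branch words differ between the two runs, because after
K - 1 = 2 shifts the prior bits have left the register.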


7.2.3 The Tree Diagram 


Although the state diagram completely characterizes the encoder, one cannot
easily use it for tracking the encoder transitions as a function of time since
the diagram cannot represent time history. The tree diagram adds the dimension
of time to the state diagram. The tree diagram for the convolutional encoder
shown in Figure 7.3 is illustrated in Figure 7.6. At each successive input bit
time the encoding procedure can be described by traversing the diagram from
left to right, each tree branch describing an output branch word. The
branching rule for finding a codeword sequence is as follows: If the input bit
is a zero, its associated branch word is found by moving to the next rightmost
branch in the upward direction. If the input bit is a one, its branch word is
found by moving to the next rightmost branch in the downward direction.
Assuming that the initial contents of the encoder is all zeros, the diagram
shows that if the first input bit is a zero, the output branch word is 00 and,
if the first input bit is a one, the output branch word is 11. Similarly, if
the first input bit is a one and the second input bit is a zero, the second
output branch word is 10. Or, if the first input bit is a one and the second
input bit is a one, the second output branch word is 01. Following this
procedure we see that the input sequence 1 1 0 1 1 traces the heavy line drawn
on the tree diagram in Figure 7.6. This path corresponds to the output
codeword sequence 1 1 0 1 0 1 0 0 0 1.

Figure 7.6 Tree representation of encoder (rate 1/2, K = 3).

The added dimension of time in the tree diagram (compared to the state
diagram) allows one to dynamically describe the encoder as a function of a
particular input sequence. However, can you see one problem in trying to use a
tree diagram for describing a sequence of any length? The number of branches
increases as a function of 2^L, where L is the number of branch words in the
sequence. You would quickly run out of paper, and patience.
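To get a feel for the 2^L growth, one can enumerate the paths of a shallow
tree and simply count those of a deeper one. The sketch below assumes the
encoder of Figure 7.3 starting from the all-zeros condition; the helper name
codeword is our own.

```python
from itertools import product

def codeword(message, generators):
    """Branch words produced from an all-zeros register, without flushing."""
    state = [0] * (len(generators[0]) - 1)
    words = []
    for bit in message:
        register = [bit] + state
        words.append("".join(str(sum(r & c for r, c in zip(register, g)) % 2)
                             for g in generators))
        state = register[:-1]
    return words


generators = [[1, 1, 1], [1, 0, 1]]
for L in (1, 2, 3, 10, 20):
    print(L, "branch words ->", 2 ** L, "paths through the tree")

# Every path of the depth-3 tree, upward (0) branches first.
for message in product((0, 1), repeat=3):
    print(message, codeword(message, generators))
```

A depth of only 20 branch words already corresponds to more than a million
distinct paths, which is why the tree is useful for illustration but not for
describing long sequences.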


7.2.4 The Trellis Diagram 

Observation of the Figure 7.6 tree diagram shows that for this example, the
structure repeats itself at time t_4, after the third branching (in general,
the tree structure repeats after K branchings, where K is the constraint
length). We label each node in the tree of Figure 7.6 to correspond to the
four possible states in the shift register, as follows: a = 00, b = 10,
c = 01, and d = 11. The first branching of the tree structure, at time t_1,
produces a pair of nodes labeled a and b. At each successive branching the
number of nodes doubles. The second branching, at time t_2, results in four
nodes labeled a, b, c, and d. After the third branching, there are a total of
eight nodes: two are labeled a, two are labeled b, two are labeled c, and two
are labeled d. We can see that all branches emanating from two nodes of the
same state generate identical branch word sequences. From this point on, the
upper and the lower halves of the tree are identical. The reason for this
should be obvious from examination of the encoder in Figure 7.3. As the fourth
input bit enters the encoder on the left, the first input bit is ejected on
the right and no longer influences the output branch words. Consequently, the
input sequences 1 0 0 x y ... and 0 0 0 x y ..., where the leftmost bit is the
earliest bit, generate the same branch words after the (K = 3)rd branching.
This means that any two nodes having the same state label at the same time t_i
can be merged, since all succeeding paths will be indistinguishable. If we do
this to the tree structure of Figure 7.6, we obtain another diagram, called
the trellis diagram. The trellis diagram, by exploiting the repetitive
structure, provides a more manageable encoder description than does the tree
diagram. The trellis diagram for the convolutional encoder of Figure 7.3 is
shown in Figure 7.7.
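The merging argument can be illustrated by tracing the two input sequences
0 0 0 x y and 1 0 0 x y through the encoder of Figure 7.3 and comparing them
branch by branch. This is a sketch under the assumption of an all-zeros
starting register; the function name trace and the particular choice
x = 1, y = 0 are ours.

```python
def trace(message, generators):
    """Return (state after each bit, branch word) pairs, starting from all zeros."""
    state = (0,) * (len(generators[0]) - 1)
    path = []
    for bit in message:
        register = (bit,) + state
        word = "".join(str(sum(r & c for r, c in zip(register, g)) % 2)
                       for g in generators)
        state = register[:-1]
        path.append((state, word))
    return path


generators = [(1, 1, 1), (1, 0, 1)]
x, y = 1, 0                                   # arbitrary later bits
upper = trace([0, 0, 0, x, y], generators)    # path in the upper half of the tree
lower = trace([1, 0, 0, x, y], generators)    # path in the lower half of the tree

for depth, (u, l) in enumerate(zip(upper, lower), start=1):
    print(depth, u, l, "same" if u == l else "differ")
```

After the third branching both paths sit in the same state, although the third
branch words themselves still differ; from the fourth branching onward the
states and branch words coincide, so the two nodes can be merged.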

In drawing the trellis diagram, we use the same convention that we introduced
with the state diagram: a solid line denotes the output generated by an input
bit zero, and a dashed line denotes the output generated by an input bit one.
The nodes of the trellis characterize the encoder states; the first row nodes
correspond to the state a = 00, the second and subsequent rows correspond to
the states b = 10, c = 01, and d = 11. At each unit of time, the trellis
requires 2^(K-1) nodes to represent the 2^(K-1) possible encoder states. The
trellis in our example assumes a fixed periodic structure after trellis depth
3 is reached (at time t_4). In the general case, the fixed structure prevails
after depth K is reached. At this point and thereafter, each of the states can
be entered from either of two preceding states. Also, each of the states can
transition to one of two states. Of the two outgoing branches, one corresponds
to an input bit zero and the other corresponds to an input bit one. On
Figure 7.7 the output branch words corresponding to the state transitions
appear as labels on the trellis branches.

Figure 7.7 Encoder trellis diagram (rate 1/2, K = 3). Rows: state a = 00,
b = 10, c = 01, d = 11. Legend: solid line, input bit 0; dashed line, input
bit 1.

One time-interval section of a fully-formed encoding trellis structure
completely defines the code. The only reason for showing several sections is
for viewing a code-symbol sequence as a function of time. The state of the
convolutional encoder is represented by the contents of the rightmost K - 1
stages in the encoder register. Some authors describe the state as the
contents of the leftmost K - 1 stages. Which description is correct? They are
both correct in the following sense. Every transition has a starting state and
a terminating state. The rightmost K - 1 stages describe the starting state
for the current input, which is in the leftmost stage (assuming a rate 1/n
encoder). The leftmost K - 1 stages represent the terminating state for that
transition. A code-symbol sequence is characterized by N branches
(representing N data bits) occupying N intervals of time and associated with a
particular state at each of N + 1 times (from start to finish). Thus, we
launch bits at times t_1, t_2, ..., t_N, and are interested in state metrics
at times t_1, t_2, ..., t_{N+1}. The convention used here is that the current
bit is located in the leftmost stage (not on a wire leading to that stage),
and the rightmost K - 1 stages start in the all-zeros state. We refer to this
time as the start time and label it t_1. We refer to the concluding time of
the last transition as the terminating time and label it t_{N+1}.

7.3 FORMULATION OF THE CONVOLUTIONAL
DECODING PROBLEM

7.3.1 Maximum Likelihood Decoding 

If all input message sequences are equally likely, a decoder that achieves the
minimum probability of error is one that compares the conditional
probabilities, also called the likelihood functions P(Z|U^(m)), where Z is the
received sequence and U^(m) is one of the possible transmitted sequences, and
chooses the maximum. The decoder chooses U^(m') if

$$P(\mathbf{Z} \mid \mathbf{U}^{(m')}) = \max_{\text{over all } \mathbf{U}^{(m)}} P(\mathbf{Z} \mid \mathbf{U}^{(m)}) \tag{7.1}$$

The maximum likelihood concept, as stated in Equation (7.1), is a fundamental
development of decision theory (see Appendix B); it is the formalization of a
"common-sense" way to make decisions when there is statistical knowledge of
the possibilities. In the binary demodulation treatment in Chapters 3 and 4
there were only two equally likely possible signals, s_1(t) or s_2(t), that
might have been transmitted. Therefore, to make the binary maximum likelihood
decision, given a received signal, meant only to decide that s_1(t) was
transmitted if

$$P(z \mid s_1) > P(z \mid s_2)$$

Otherwise, to decide that s_2(t) was transmitted. The parameter z represents
z(T), the receiver predetection value at the end of each symbol duration time
t = T. However, when applying maximum likelihood to the convolutional decoding
problem, we observe that the convolutional code has memory (the received
sequence represents the superposition of current bits and prior bits). Thus,
applying maximum likelihood to the decoding of convolutionally encoded bits is
performed in the context of choosing the most likely sequence, as shown in
Equation (7.1). There are typically a multitude of possible codeword sequences
that might have been transmitted. To be specific, for a binary code, a
sequence of L branch words is a member of a set of 2^L possible sequences.
Therefore, in the maximum likelihood context, we can say that the decoder
chooses a particular U^(m') as the transmitted sequence if the likelihood
P(Z|U^(m')) is greater than the likelihoods of all the other possible
transmitted sequences. Such an optimal decoder, which minimizes the error
probability (for the case where all transmitted sequences are equally likely),
is known as a maximum likelihood decoder. The likelihood functions are given
or computed from the specifications of the channel.

We will assume that the noise is additive white Gaussian with zero mean and
the channel is memoryless, which means that the noise affects each code symbol
independently of all the other symbols. For a convolutional code of rate 1/n,
we can therefore express the likelihood as

$$P(\mathbf{Z} \mid \mathbf{U}^{(m)}) = \prod_{i=1}^{\infty} P(Z_i \mid U_i^{(m)}) = \prod_{i=1}^{\infty} \prod_{j=1}^{n} P(z_{ji} \mid u_{ji}^{(m)}) \tag{7.2}$$

where Z_i is the ith branch of the received sequence Z, U_i^(m) is the ith
branch of a particular codeword sequence U^(m), z_ji is the jth code symbol of
Z_i, and u_ji^(m) is the jth code symbol of U_i^(m), and each branch comprises
n code symbols. The decoder problem consists of choosing a path through the
trellis of Figure 7.7 (each possible path defines a codeword sequence) such
that

$$\prod_{i=1}^{\infty} \prod_{j=1}^{n} P(z_{ji} \mid u_{ji}^{(m)}) \quad \text{is maximized} \tag{7.3}$$

Generally, it is computationally more convenient to use the logarithm of the
likelihood function since this permits the summation, instead of the
multiplication, of terms. We are able to use this transformation because the
logarithm is a monotonically increasing function and thus will not alter the
final result in our codeword selection. We can define the log-likelihood
function as

$$\gamma_U(m) = \log P(\mathbf{Z} \mid \mathbf{U}^{(m)}) = \sum_{i=1}^{\infty} \log P(Z_i \mid U_i^{(m)}) = \sum_{i=1}^{\infty} \sum_{j=1}^{n} \log P(z_{ji} \mid u_{ji}^{(m)}) \tag{7.4}$$

The decoder problem now consists of choosing a path through the tree of
Figure 7.6 or the trellis of Figure 7.7 such that γ_U(m) is maximized. For the
decoding of convolutional codes, either the tree or the trellis structure can
be used. In the tree representation of the code, the fact that the paths
remerge is ignored. Since for a binary code, the number of possible sequences
made up of L branch words is 2^L, maximum likelihood decoding of such a
received sequence, using a tree diagram, requires the "brute force" or
exhaustive comparison of 2^L accumulated log-likelihood metrics, representing
all the possible different codeword sequences that could have been
transmitted. Hence it is not practical to consider maximum likelihood decoding
with a tree structure. It is shown in a later section that with the use of the
trellis representation of the code, it is possible to configure a decoder
which can discard the paths that could not possibly be candidates for the
maximum likelihood sequence. The decoded path is chosen from some reduced set
of surviving paths. Such a decoder is still optimum in the sense that the
decoded path is the same as the decoded path obtained from a "brute force"
maximum likelihood decoder, but the early rejection of unlikely paths reduces
the decoding complexity.
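The "brute force" comparison of 2^L log-likelihood metrics can be sketched for
a very short message. The code below is an illustration under several
assumptions that are not part of the text: code symbol 1 is sent as +1 and 0
as -1, the noise has standard deviation sigma, the sum in Equation (7.4) is
truncated to a flushed message of L bits, and the names encode, log_likelihood,
and brute_force_ml are hypothetical.

```python
import math
from itertools import product

GENERATORS = [(1, 1, 1), (1, 0, 1)]           # encoder of Figure 7.3

def encode(message, generators=GENERATORS):
    """Rate 1/n encoding from an all-zeros register, with K - 1 flush zeros."""
    K = len(generators[0])
    state, symbols = (0,) * (K - 1), []
    for bit in tuple(message) + (0,) * (K - 1):
        register = (bit,) + state
        symbols += [sum(r & c for r, c in zip(register, g)) % 2 for g in generators]
        state = register[:-1]
    return symbols

def log_likelihood(received, codeword, sigma=1.0):
    """Sum of log P(z_ji | u_ji) for a memoryless Gaussian channel.

    Assumption: code symbol 1 is sent as +1 and 0 as -1, noise variance sigma**2.
    """
    total = 0.0
    for z, u in zip(received, codeword):
        mean = 1.0 if u else -1.0
        total += (-0.5 * math.log(2 * math.pi * sigma ** 2)
                  - (z - mean) ** 2 / (2 * sigma ** 2))
    return total

def brute_force_ml(received, message_length):
    """Compare all 2**L candidate codewords and keep the most likely message."""
    return max(product((0, 1), repeat=message_length),
               key=lambda m: log_likelihood(received, encode(m)))


sent = encode((1, 0, 1))                           # 11 10 00 10 11
noisy = [(1.0 if u else -1.0) for u in sent]
noisy[2] = -0.3                                    # one symbol pushed across zero
noisy[7] = 0.2                                     # another weak, sign-flipped symbol
print(brute_force_ml(noisy, message_length=3))
# (1, 0, 1)
```

With L = 3 only eight metrics are compared, but the number of candidates
doubles with every additional message bit, which is the impracticality noted
above.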

For an excellent tutorial on the structure of convolutional codes, maximum
likelihood decoding, and code performance, see Reference [8]. There are
several algorithms that yield approximate solutions to the maximum likelihood
decoding problem, including sequential [9, 10] and threshold [11]. Each of
these algorithms is suited to certain special applications, but all are
suboptimal. In contrast, the Viterbi decoding algorithm performs maximum
likelihood decoding and is therefore optimal. This does not imply that the
Viterbi algorithm is best for every application; there are severe constraints
imposed by hardware complexity. The Viterbi algorithm is considered in
Sections 7.3.3 and 7.3.4.

7.3.2 Channel Models: Hard versus Soft Decisions 

Before specifying an algorithm that will determine the maximum likelihood
decision, let us describe the channel. The codeword sequence U^(m), made up of
branch words, with each branch word comprised of n code symbols, can be
considered to be an endless stream, as opposed to a block code, in which the source
data and their codewords are partitioned into precise block sizes. The codeword
sequence shown in Figure 7.1 emanates from the convolutional encoder and enters
the modulator, where the code symbols are transformed into signal waveforms. The
modulation may be baseband (e.g., pulse waveforms) or bandpass (e.g., PSK or
FSK). In general, l symbols at a time, where l is an integer, are mapped into signal
waveforms s_i(t), where i = 1, 2, ..., M = 2^l. When l = 1, the modulator maps each
code symbol into a binary waveform. The channel over which the waveform is
transmitted is assumed to corrupt the signal with Gaussian noise. When the
corrupted signal is received, it is first processed by the demodulator and then by
the decoder.

Consider that a binary signal transmitted over a symbol interval (0, T) is
represented by s_1(t) for a binary one and s_2(t) for a binary zero. The received
signal is r(t) = s_i(t) + n(t), where n(t) is a zero-mean Gaussian noise process. In
Chapter 3 we described the detection of r(t) in terms of two basic steps. In the first
step, the received waveform is reduced to a single number, z(T) = a_i + n_0, where
a_i is the signal component of z(T) and n_0 is the noise component. The noise
component, n_0, is a zero-mean Gaussian random variable, and thus z(T) is a
Gaussian random variable with a mean of either a_1 or a_2, depending on whether a
binary one or binary zero was sent. In the second step of the detection process a
decision was made as to which signal was transmitted, on the basis of comparing
z(T) to a threshold. The conditional probabilities of z(T), p(z|s_1) and p(z|s_2), are
shown in Figure 7.8, labeled likelihood of s_1 and likelihood of s_2. The demodulator
in Figure 7.1 converts the set of time-ordered random variables {z(T)} into a code
sequence Z and passes it on to the decoder. The demodulator output can be
configured in a variety of ways. It can be implemented to make a firm or hard
decision as to whether z(T) represents a zero or a one. In this case, the output of
the demodulator is quantized to two levels, zero and one, and fed into the decoder
(this is exactly the same threshold decision that was made in Chapters 3 and 4).
Since the decoder operates on the hard decisions made by the demodulator, the
decoding is called hard-decision decoding.

Figure 7.8 Hard and soft decoding decisions.

The demodulator can also be configured to feed the decoder with a value of z(T)
quantized to more than two levels. Such an implementation furnishes the decoder
with more information than is provided in the hard-decision case. When the
quantization level of the demodulator output is greater than two, the decoding is
called soft-decision decoding. Eight levels (3 bits) of quantization are illustrated
on the abscissa of Figure 7.8. When the demodulator sends a hard binary decision
to the decoder, it sends it a single binary symbol. When the demodulator sends a
soft binary decision, quantized to eight levels, it sends the decoder a 3-bit word
describing an interval along z(T). In effect, sending such a 3-bit word in place of a
single binary symbol is equivalent to sending the decoder a measure of confidence
along with the code-symbol decision. Referring to Figure 7.8, if the demodulator
sends 1 1 1 to the decoder, this is tantamount to declaring the code symbol to be a
one with very high confidence, while sending a 1 0 0 is tantamount to declaring the
code symbol to be a one with very low confidence. It should be clear that
ultimately, every message decision out of the decoder must be a hard decision;
otherwise, one might see computer printouts that read: "think it's a 1," "think it's
a 0," and so on. The idea behind the demodulator not making hard decisions and
sending more data (soft decisions) to the decoder can be thought of as an interim
step to provide the decoder with more information, which the decoder then uses
for recovering the message sequence (with better error performance than it could
in the case of hard-decision decoding). The 8-level soft-decision metric of
Figure 7.8 is often shown instead as -7, -5, -3, -1, 1, 3, 5, 7. Such a designation
lends itself to a simple interpretation of the soft decision: The sign of the metric
represents a decision (e.g., choose s_1 if positive, choose s_2 if negative), and the
magnitude of the metric represents the confidence level of that decision. The only
advantage for the metric shown in Figure 7.8 is that it avoids the use of negative
numbers.
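The two demodulator output formats can be sketched in a few lines of Python. This is only an illustration: the seven threshold values are hypothetical (they assume a normalized z(T) with means near +1 and -1), and the function names are not from the text. The sketch maps z(T) to the 3-bit word of Figure 7.8 and to the signed metric -7, ..., 7.

```python
import bisect

# Hypothetical decision thresholds on z(T); seven thresholds give eight intervals.
THRESHOLDS = [-0.75, -0.5, -0.25, 0.0, 0.25, 0.5, 0.75]

def soft_decision(z):
    """Return (3-bit word, signed metric) for a demodulator output z."""
    level = bisect.bisect_right(THRESHOLDS, z)  # interval index, 0..7
    word = format(level, "03b")                 # "000" .. "111"
    metric = 2 * level - 7                      # -7, -5, ..., 5, 7
    return word, metric

def hard_decision(z):
    """Two-level quantization: a single threshold at zero."""
    return 1 if z >= 0.0 else 0

for z in (0.9, 0.1, -0.1, -0.9):
    print(z, hard_decision(z), soft_decision(z))
```

In this sketch the leading bit of the 3-bit word equals the hard decision, and the remaining two bits carry the confidence.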

For a Gaussian channel, eight-level quantization results in a performance
improvement of approximately 2 dB in required signal-to-noise ratio compared to
two-level quantization. This means that eight-level soft-decision decoding can
provide the same probability of bit error as that of hard-decision decoding, but
requires 2 dB less signal-to-noise ratio for the same performance. Analog (or
infinite-level quantization) results in a 2.2-dB performance improvement over
two-level quantization; therefore, eight-level quantization results in a loss of
approximately 0.2 dB compared to infinitely fine quantization. For this reason,
quantization to more than eight levels can yield little performance improvement
[12]. What price is paid for such improved soft-decision-decoder performance? In
the case of hard-decision decoding, a single bit is used to describe each code
symbol, while for eight-level quantized soft-decision decoding 3 bits are used to
describe each code symbol; therefore, three times the amount of data must be
handled during the decoding process. Hence the price paid for soft-decision
decoding is an increase in required memory size at the decoder (and possibly a
speed penalty).

Block decoding algorithms and convolutional decoding algorithms have been
devised to operate with hard or soft decisions. However, soft-decision decoding is
generally not used with block codes because it is considerably more difficult than
hard-decision decoding to implement. The most prevalent use of soft-decision
decoding is with the Viterbi convolutional decoding algorithm, since with Viterbi
decoding, soft decisions represent only a trivial increase in computation.


7.3.2.1 Binary Symmetric Channel

A binary symmetric channel (BSC) is a discrete memoryless channel (see 
Section 6.3.1) that has binary input and output alphabets and symmetric transition 
probabilities. It can be described by the conditional probabilities 


$$P(0 \mid 1) = P(1 \mid 0) = p$$
$$P(1 \mid 1) = P(0 \mid 0) = 1 - p \qquad (7.5)$$


as illustrated in Figure 7.9. The probability that an output symbol will differ from
the input symbol is p, and the probability that the output symbol will be identical to
the input symbol is (1 - p). The BSC is an example of a hard-decision channel,
which means that, even though continuous-valued signals may be received by the
demodulator, a BSC allows only firm decisions such that each demodulator output
symbol z_ji, as shown in Figure 7.1, consists of one of two binary values. The
indexing of z_ji pertains to the jth code symbol of the ith branch word, Z_i. The
demodulator then feeds the sequence Z = {Z_i} to the decoder.

Let U^(m) be a transmitted codeword over a BSC with symbol error probability
p, and let Z be the corresponding received decoder sequence. As noted previously,
a maximum likelihood decoder chooses the codeword U^(m') that maximizes the
likelihood P(Z|U^(m)) or its logarithm. For a BSC, this is equivalent to choosing the
codeword U^(m') that is closest in Hamming distance to Z [8]. Thus Hamming
distance is an appropriate metric to describe the distance or closeness of fit between
U^(m) and Z. From all the possible transmitted sequences U^(m), the decoder
chooses the U^(m') sequence for which the distance to Z is minimum.

Suppose that U^(m) and Z are each L-bit-long sequences and that they differ in
d_m positions (i.e., the Hamming distance between U^(m) and Z is d_m). Then, since
the channel is assumed to be memoryless, the probability that this U^(m) was
transformed to the specific received Z at distance d_m from it can be written as

Figure 7.9 Binary symmetric channel (hard-decision channel).

$$P(Z \mid U^{(m)}) = p^{d_m} (1 - p)^{L - d_m} \qquad (7.6)$$

and the log-likelihood function is

$$\log P(Z \mid U^{(m)}) = -d_m \log\left(\frac{1 - p}{p}\right) + L \log(1 - p) \qquad (7.7)$$

If we compute this quantity for each possible transmitted sequence, the last term in 
the equation will be constant in each case. Assuming that p < 0.5, we can express 
Equation (7.7) as 


$$\log P(Z \mid U^{(m)}) = -A\, d_m - B \qquad (7.8)$$

where A and B are positive constants. Therefore, choosing the codeword U^(m')
such that the Hamming distance d_m to the received sequence Z is minimized
corresponds to maximizing the likelihood or log-likelihood metric. Consequently,
over a BSC, the log-likelihood metric is conveniently replaced by the Hamming
distance, and a maximum likelihood decoder will choose, in the tree or trellis
diagram, the path whose corresponding sequence U^(m') is at the minimum Hamming
distance to the received sequence Z.
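The equivalence between maximizing the BSC log-likelihood and minimizing Hamming distance can be checked numerically. The sketch below is illustrative only: the crossover probability p = 0.1 is an assumed value, the received sequence is borrowed from the example of Section 7.3.4, and the three candidate sequences are a small hand-picked set rather than the full code.

```python
import math

def hamming(a, b):
    """Number of positions in which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def bsc_log_likelihood(z, u, p):
    """log P(Z|U) over a BSC, written as -d*log((1-p)/p) + L*log(1-p)."""
    d = hamming(z, u)
    return -d * math.log((1 - p) / p) + len(z) * math.log(1 - p)

p = 0.1                                    # assumed crossover probability, below 0.5
z = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]         # received sequence 11 01 01 10 01
candidates = {
    "11 01 01 00 01": [1, 1, 0, 1, 0, 1, 0, 0, 0, 1],
    "00 00 00 00 00": [0] * 10,
    "11 10 11 00 00": [1, 1, 1, 0, 1, 1, 0, 0, 0, 0],
}

best_by_likelihood = max(candidates, key=lambda k: bsc_log_likelihood(z, candidates[k], p))
best_by_distance = min(candidates, key=lambda k: hamming(z, candidates[k]))
print(best_by_likelihood, best_by_distance)
```

Because the term L log(1 - p) is the same for every candidate and log((1 - p)/p) is positive when p is below 0.5, the two rankings always agree.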


7.3.2.2 Gaussian Channel 

For a Gaussian channel, each demodulator output symbol z_ji, as shown in
Figure 7.1, is a value from a continuous alphabet. The symbol z_ji cannot be labeled
as a correct or incorrect detection decision. Sending the decoder such soft decisions
can be viewed as sending a family of conditional probabilities of the different
symbols (see Section 6.3.1). It can be shown [8] that maximizing P(Z|U^(m)) is
equivalent to maximizing the inner product between the codeword sequence U^(m)
(consisting of binary symbols represented as bipolar values) and the analog-valued
received sequence Z. Thus, the decoder chooses the codeword U^(m') if it maximizes

$$\sum_{i=1}^{\infty} \sum_{j=1}^{n} z_{ji}\, u_{ji}^{(m)} \qquad (7.9)$$


This is equivalent to choosing the codeword U^(m') that is closest in Euclidean
distance to Z. Even though the hard- and soft-decision channels require different
metrics, the concept of choosing the codeword U^(m') that is closest to the received
sequence, Z, is the same in both cases. To implement the maximization of
Equation (7.9) exactly, the decoder would have to be able to handle analog-valued
arithmetic operations. This is impractical because the decoder is generally
implemented digitally. Thus it is necessary to quantize the received symbols z_ji.
Does Equation (7.9) remind you of the demodulation treatment in Chapters 3 and
4? Equation (7.9) is the discrete version of correlating an input received waveform,
r(t), with a reference waveform, s_i(t), as expressed in Equation (4.15). The
quantized Gaussian channel, typically referred to as a soft-decision channel, is the
channel model assumed for the soft-decision decoding described earlier.
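A small numerical check of the statement that maximizing the inner product is the same as minimizing Euclidean distance, under the assumption that binary symbols are mapped to bipolar values +1 and -1 (so every candidate has the same energy). The received values and the candidate set are hypothetical and chosen only for illustration.

```python
def bipolar(bits):
    """Map binary symbols to bipolar values: 1 -> +1.0, 0 -> -1.0 (assumed mapping)."""
    return [1.0 if b else -1.0 for b in bits]

def correlation(z, u_bits):
    """Inner product between received values and a bipolar candidate sequence."""
    return sum(zj * uj for zj, uj in zip(z, bipolar(u_bits)))

def squared_euclidean(z, u_bits):
    """Squared Euclidean distance between received values and a bipolar candidate."""
    return sum((zj - uj) ** 2 for zj, uj in zip(z, bipolar(u_bits)))

# Hypothetical analog demodulator outputs (unquantized soft decisions).
z = [0.9, 1.2, -0.8, 0.7, -1.1, 0.3, 0.2, -0.6, -0.9, 1.0]
candidates = [
    [1, 1, 0, 1, 0, 1, 0, 0, 0, 1],
    [0] * 10,
    [1, 1, 1, 0, 1, 1, 0, 0, 0, 0],
]

by_correlation = max(candidates, key=lambda u: correlation(z, u))
by_distance = min(candidates, key=lambda u: squared_euclidean(z, u))
print(by_correlation == by_distance)
```

The agreement follows from expanding the squared distance: the terms in z squared and u squared do not depend on the candidate, leaving only minus twice the correlation.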

7.3.3 The Viterbi Convolutional Decoding Algorithm 

The Viterbi decoding algorithm was discovered and analyzed by Viterbi [13] in
1967. The Viterbi algorithm essentially performs maximum likelihood decoding;
however, it reduces the computational load by taking advantage of the special
structure in the code trellis. The advantage of Viterbi decoding, compared with
brute-force decoding, is that the complexity of a Viterbi decoder is not a function of
the number of symbols in the codeword sequence. The algorithm involves
calculating a measure of similarity, or distance, between the received signal, at time
t_i, and all the trellis paths entering each state at time t_i. The Viterbi algorithm
removes from consideration those trellis paths that could not possibly be candidates
for the maximum likelihood choice. When two paths enter the same state, the one
having the best metric is chosen; this path is called the surviving path. This
selection of surviving paths is performed for all the states. The decoder continues in
this way to advance deeper into the trellis, making decisions by eliminating the
least likely paths. The early rejection of the unlikely paths reduces the decoding
complexity. In 1969, Omura [14] demonstrated that the Viterbi algorithm is, in fact,
maximum likelihood. Note that the goal of selecting the optimum path can be
expressed, equivalently, as choosing the codeword with the maximum likelihood
metric, or as choosing the codeword with the minimum distance metric.

7.3.4 An Example of Viterbi Convolutional Decoding 

For simplicity, a BSC is assumed; thus Hamming distance is a proper distance
measure. The encoder for this example is shown in Figure 7.3, and the encoder
trellis diagram is shown in Figure 7.7. A similar trellis can be used to represent the
decoder, as shown in Figure 7.10. We start at time t_1 in the 00 state (flushing the
encoder between messages provides the decoder with starting-state knowledge).
Since in this example there are only two possible transitions leaving any state, not
all branches need be shown initially. The full trellis structure evolves after time t_3.
The basic idea behind the decoding procedure can best be understood by examining
the Figure 7.7 encoder trellis in concert with the Figure 7.10 decoder trellis. For
the decoder trellis it is convenient, at each time interval, to label each branch with
the Hamming distance between the received code symbols and the branch word
corresponding to the same branch from the encoder trellis. The example in Figure
7.10 shows a message sequence m, the corresponding codeword sequence U, and a
noise-corrupted received sequence Z = 11 01 01 10 01 .... The branch words seen
on the encoder trellis branches characterize the encoder in Figure 7.3 and are
known a priori to both the encoder and the decoder. These encoder branch words
are the code symbols that would be expected to come from the encoder output as a
result of each of the state transitions. The labels on the decoder trellis branches are
accumulated by the decoder on the fly. That is, as the code symbols are received,
each branch of the decoder trellis is labeled with a metric of similarity (Hamming
distance) between the received code symbols and each of the branch words for that
time interval.

Figure 7.10 Decoder trellis diagram (rate 1/2, K = 3). [Trellis for states a = 00,
b = 10, c = 01, d = 11, annotated with the input data sequence m, the transmitted
codeword U, the received sequence Z, and the branch metrics.]

From the received sequence Z, shown in Figure 7.10, we see that the code symbols
received at (following) time t_1 are 11. In order to label the decoder branches at
(departing) time t_1 with the appropriate Hamming distance metric, we look at the
Figure 7.7 encoder trellis. Here we see that a state 00 → 00 transition yields an
output branch word of 00. But we received 11. Therefore, on the decoder trellis we
label the state 00 → 00 transition with the Hamming distance between them,
namely 2. Looking at the encoder trellis again, we see that a state 00 → 10
transition yields an output branch word of 11, which corresponds exactly with the
code symbols we received at time t_1. Therefore, on the decoder trellis, we label the
state 00 → 10 transition with a Hamming distance of 0. In summary, the metric
entered on a decoder trellis branch represents the difference (distance) between
what was received and what "should have been" received had the branch word
associated with that branch been transmitted. In effect, these metrics describe a
correlation-like measure between a received branch word and each of the candidate
branch words. We continue labeling the decoder trellis branches in this way as the
symbols are received at each time t_i. The decoding algorithm uses these Hamming
distance metrics to find the most likely (minimum distance) path through the trellis.
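The branch-labeling step can be sketched as follows. The encoder of Figure 7.3 is not reproduced in this section, so the sketch assumes a rate 1/2, K = 3 encoder with connection vectors 111 and 101; that assumption reproduces the two branch words quoted above (00 for the 00 → 00 transition and 11 for the 00 → 10 transition). The function names are illustrative.

```python
# Assumed encoder: rate 1/2, K = 3, connection vectors 111 and 101.
# A state is the pair (s1, s2): the most recent and the next most recent input bits.
STATES = [(0, 0), (1, 0), (0, 1), (1, 1)]   # a = 00, b = 10, c = 01, d = 11

def branch(state, bit):
    """Return (next_state, branch_word) for a given state and input bit."""
    s1, s2 = state
    word = (bit ^ s1 ^ s2, bit ^ s2)
    return (bit, s1), word

def hamming(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def label_branches(received):
    """Branch metric for every trellis transition, given one received branch word."""
    labels = {}
    for state in STATES:
        for bit in (0, 1):
            nxt, word = branch(state, bit)
            labels[(state, nxt)] = hamming(received, word)
    return labels

labels_t1 = label_branches((1, 1))           # code symbols received at time t_1
print(labels_t1[((0, 0), (0, 0))], labels_t1[((0, 0), (1, 0))])
```

Only the two transitions leaving state 00 matter at time t_1, since the decoder is assumed to start in that state; the remaining labels come into play once the full trellis has evolved.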

The basis of Viterbi decoding is the following observation: If any two paths in
the trellis merge to a single state, one of them can always be eliminated in the
search for an optimum path. For example, Figure 7.11 shows two paths merging at
time t_5 to state 00. Let us define the cumulative Hamming path metric of a given
path at time t_i as the sum of the branch Hamming distance metrics along that path
up to time t_i. In Figure 7.11 the upper path has metric 4; the lower has metric 1. The
upper path cannot be a portion of the optimum path because the lower path, which
enters the same state, has a lower metric. This observation holds because of the
Markov nature of the encoder state: The present state summarizes the encoder
history in the sense that previous states cannot affect future states or future output
branches.

Figure 7.11 Path metrics for two merging paths.

At each time t_i there are 2^(K-1) states in the trellis, where K is the constraint
length, and each state can be entered by means of two paths. Viterbi decoding
consists of computing the metrics for the two paths entering each state and
eliminating one of them. This computation is done for each of the 2^(K-1) states or
nodes at time t_i; then the decoder moves to time t_(i+1) and repeats the process. At a
given time, the winning path metric for each state is designated as the state metric
for that state at that time. The first few steps in our decoding example are as
follows (see Figure 7.12). Assume that the input data sequence m, codeword U, and
received sequence Z are as shown in Figure 7.10. Assume that the decoder knows
the correct initial state of the trellis. (This assumption is not necessary in practice,
but simplifies the explanation.) At time t_1 the received code symbols are 11. From
state 00 the only possible transitions are to state 00 or state 10, as shown in
Figure 7.12a. State 00 → 00 transition has branch metric 2; state 00 → 10
transition has branch metric 0. At time t_2 there are two possible branches leaving
each state, as shown in Figure 7.12b. The cumulative metrics of these branches are
labeled state metrics Γ_a, Γ_b, Γ_c, and Γ_d, corresponding to the terminating state.
At time t_3 in Figure 7.12c there are again two branches diverging from each state.
As a result, there are two paths entering each state at time t_4. One path entering
each state can be eliminated, namely, the one having the larger cumulative path
metric. Should metrics of the two entering paths be of equal value, one path is
chosen for elimination by using an arbitrary rule. The surviving path into each state
is shown in Figure 7.12d. At this point in the decoding process, there is only a
single surviving path, termed the common stem, between times t_1 and t_2. Therefore,
the decoder can now decide that the state transition which occurred between t_1 and
t_2 was 00 → 10. Since this transition is produced by an input bit one, the decoder
outputs a one as the first decoded bit.
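The first steps of the example can be reproduced with a short forward pass that keeps only the winning metric into each state. As before, the encoder connection vectors 111 and 101 are an assumption consistent with the branch words quoted in the text, and the helper names are illustrative.

```python
INF = float("inf")
STATES = [(0, 0), (1, 0), (0, 1), (1, 1)]   # a = 00, b = 10, c = 01, d = 11

def branch(state, bit):
    """Assumed rate 1/2, K = 3 encoder (connection vectors 111 and 101)."""
    s1, s2 = state
    return (bit, s1), (bit ^ s1 ^ s2, bit ^ s2)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def state_metrics(received_words):
    """Return the list of state metrics [a, b, c, d] after each received branch word."""
    gamma = {s: INF for s in STATES}
    gamma[(0, 0)] = 0                        # decoder starts in state 00
    history = []
    for z in received_words:
        new_gamma = {s: INF for s in STATES}
        for s in STATES:
            if gamma[s] == INF:              # state not yet reachable
                continue
            for bit in (0, 1):
                nxt, word = branch(s, bit)
                # add the branch metric, then keep the smaller of the entering paths
                new_gamma[nxt] = min(new_gamma[nxt], gamma[s] + hamming(z, word))
        gamma = new_gamma
        history.append([gamma[s] for s in STATES])
    return history

history = state_metrics([(1, 1), (0, 1), (0, 1)])   # Z = 11 01 01 ...
for row in history:
    print(row)
```

The three rows are the state metrics at times t_2, t_3, and t_4; states c and d show infinity at t_2 because no path has reached them yet.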


Figure 7.12 Selection of survivor paths. (a) Survivors at t_2. (b) Survivors at t_3.
(c) Metric comparisons at t_4. (d) Survivors at t_4. (e) Metric comparisons at t_5.
(f) Survivors at t_5. (g) Metric comparisons at t_6. (h) Survivors at t_6.


Here we can see how the decoding of the surviving branch is facilitated by having
drawn the trellis branches with solid lines for input zeros and dashed lines for input
ones. Note that the first bit was not decoded until the path metric computation had
proceeded to a much greater depth into the trellis. For a typical decoder
implementation, this represents a decoding delay which can be as much as five times
the constraint length in bits.

At each succeeding step in the decoding process, there will always be two
possible paths entering each state; one of the two will be eliminated by comparing
the path metrics. Figure 7.12e shows the next step in the decoding process. Again,
at time t_5 there are two paths entering each state, and one of each pair can be
eliminated. Figure 7.12f shows the survivors at time t_5. Notice that in our example
we cannot yet make a decision on the second input data bit because there still are
two paths leaving the state 10 node at time t_2. At time t_6 in Figure 7.12g we again
see the pattern of remerging paths, and in Figure 7.12h we see the survivors at time
t_6. Also, in Figure 7.12h the decoder outputs one as the second decoded bit,
corresponding to the single surviving path between t_2 and t_3. The decoder continues
in this way to advance deeper into the trellis and to make decisions on the input
data bits by eliminating all paths but one.

Pruning the trellis (as paths remerge) guarantees that there are never more
paths than there are states. For this example, verify that after each pruning in
Figures 7.12b, d, f, and h, there are only 4 paths. Compare this to attempting a
"brute force" maximum-likelihood sequence estimation without using the Viterbi
algorithm. In that case, the number of possible paths (representing possible
sequences) is an exponential function of sequence length. For a binary codeword
sequence that has a length of L branch words, there are 2^L possible sequences.
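For comparison, the brute-force search can be written directly for the five-branch example. It enumerates all 2^L candidate messages, encodes each one, and keeps the codeword closest to Z in Hamming distance. The encoder (connection vectors 111 and 101) is an assumption chosen to match the branch words quoted in the example, and the function names are illustrative.

```python
from itertools import product

def encode(message):
    """Assumed rate 1/2, K = 3 encoder (connection vectors 111 and 101), starting in state 00."""
    s1 = s2 = 0
    out = []
    for bit in message:
        out += [bit ^ s1 ^ s2, bit ^ s2]
        s1, s2 = bit, s1
    return out

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def brute_force_decode(received):
    """Exhaustively compare all 2^L candidate messages (L = number of branch words)."""
    L = len(received) // 2
    candidates = list(product((0, 1), repeat=L))
    best = min(candidates, key=lambda m: hamming(encode(m), received))
    return best, len(candidates)

z = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]           # Z = 11 01 01 10 01
decoded, num_paths = brute_force_decode(z)
print(decoded, num_paths)
```

For L = 5 the search already examines 32 paths, whereas the pruned trellis never holds more than 4; the gap doubles with every additional branch word.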

7.3.5 Decoder Implementation 

In the context of the trellis diagram of Figure 7.10, transitions during any one time
interval can be grouped into 2^(v-1) disjoint cells, each cell depicting four possible
transitions, where v = K - 1 is called the encoder memory. For the K = 3 example,
v = 2 and 2^(v-1) = 2 cells. These cells are shown in Figure 7.13, where a, b, c, and d
refer to the states at time t_i, and a', b', c', and d' refer to the states at time t_(i+1).
Shown on each transition is the branch metric δ_xy, where the subscript indicates that
the metric corresponds to the transition from state x to state y. These cells and the
associated logic units that update the state metrics {Γ_x}, where x designates a
particular state, represent the basic building blocks of the decoder.

[Figure 7.13: cell 1 and cell 2.]
7.3.5.1 Add-Compare-Select Computation

Continuing with the K = 3, 2-cell example, Figure 7.14 illustrates the logic unit
that corresponds to cell 1. The logic executes the special purpose computation
called add-compare-select (ACS). The state metric Γ_a' is calculated by adding the
previous-time state metric of state a, Γ_a, to the branch metric δ_aa' and the
previous-time state metric of state c, Γ_c, to the branch metric δ_ca'. This results in
two possible path metrics as candidates for the new state metric Γ_a'. The two
candidates are compared in the logic unit of Figure 7.14. The largest likelihood
(smallest distance) of the two path metrics is stored as the new state metric Γ_a' for
state a. Also stored is the new path history m̂_a' for state a, where m̂_a' is the
message-path history of the state augmented by the data of the winning path.


Figure 7.14 Logic unit that implements the add-compare-select functions
corresponding to cell 1.

Also shown in Figure 7.14 is the cell-1 ACS logic that yields the new state
metric Γ_b' and the new path history m̂_b'. This ACS operation is similarly performed
for the paths in other cells. The oldest bit on the path with the smallest state metric
forms the decoder output.
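A single add-compare-select step for state a can be sketched as a small function. The argument names and the tie rule (keep the path from state a when the two candidates are equal) are assumptions made for illustration; the text only requires that ties be broken by some arbitrary rule. The numeric values are those of the state-a update at time t_4 in the decoding example, and the two path histories are the input bits along the paths 00 → 00 → 00 and 00 → 10 → 01.

```python
def add_compare_select(gamma_a, gamma_c, delta_aa, delta_ca, hist_a, hist_c, bit):
    """One ACS step for new state a, which can be entered from state a or state c.

    gamma_*  : previous-time state metrics
    delta_*  : branch metrics of the two entering transitions
    hist_*   : message-path histories of the two predecessor states
    bit      : input bit associated with entering the new state (0 for state a)
    """
    candidate_from_a = gamma_a + delta_aa     # add
    candidate_from_c = gamma_c + delta_ca     # add
    if candidate_from_c >= candidate_from_a:  # compare (ties keep the path from a)
        return candidate_from_a, hist_a + [bit]   # select
    return candidate_from_c, hist_c + [bit]       # select

new_metric, new_history = add_compare_select(
    gamma_a=3, gamma_c=2, delta_aa=1, delta_ca=1,
    hist_a=[0, 0], hist_c=[1, 0], bit=0,
)
print(new_metric, new_history)
```

The same function serves every state of every cell; only the pair of predecessor states and the input bit change.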


7.3.5.2 Add-Compare-Select as Seen on the Trellis

Consider the same example that was used for describing Viterbi decoding in
Section 7.3.4. The message sequence was m = 1 1 0 1 1, the codeword sequence was
U = 11 01 01 00 01, and the received sequence was Z = 11 01 01 10 01. Figure 7.15
depicts a decoding trellis diagram similar to Figure 7.10. A branch metric that
labels each branch is the Hamming distance between the received code symbols and
the corresponding branch word from the encoder trellis. Additionally, the Figure
7.15 trellis indicates a value at each state x, and for each time from time t_2 to t_6,
which is a state metric Γ_x. We perform the add-compare-select (ACS) operation
when there are two transitions entering a state, as there are for times t_4 and later.
For example, at time t_4 the value of the state metric for state a is obtained by
incrementing the state metric Γ_a = 3 at time t_3 with the branch metric δ_aa = 1,
yielding a candidate value of 4. Simultaneously, the state metric Γ_c = 2 at time t_3 is
incremented with the branch metric δ_ca = 1, yielding a candidate value of 3. The
select operation of the ACS process selects the largest-likelihood (minimum
distance) path metric as the new state metric; hence, for state a at time t_4 the new
state metric is Γ_a = 3. The winning path is shown with a heavy line and the path
that has been dropped is shown with a lighter line. On the trellis of Figure 7.15,
observe the state metrics from left to right. Verify that at each time, the value of
each state metric is obtained by incrementing the connected state metric from the
previous time along the winning path (heavy line) with the branch metric between
them. At some point in the trellis (after a time interval of 4 or 5 times the
constraint length), the oldest bits can be decoded. As an example, looking at time
t_6 in Figure 7.15, we see that the minimum-distance state metric has a value of 1.
From this state d, the winning path can be traced back to time t_1, and one can
verify that the decoded message is the same as the original message, by the
convention that dashed and solid lines represent binary ones and zeros, respectively.
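The complete trellis walk for this example fits in a short decoder sketch. It is a minimal hard-decision version under stated assumptions: the encoder connection vectors 111 and 101 (chosen because they reproduce U = 11 01 01 00 01 from m = 1 1 0 1 1), a known starting state of 00, and path histories stored per state instead of an explicit traceback. Ties are broken in favor of the first path examined, which is one possible arbitrary rule.

```python
INF = float("inf")
STATES = [(0, 0), (1, 0), (0, 1), (1, 1)]   # a = 00, b = 10, c = 01, d = 11

def branch(state, bit):
    """Assumed rate 1/2, K = 3 encoder (connection vectors 111 and 101)."""
    s1, s2 = state
    return (bit, s1), (bit ^ s1 ^ s2, bit ^ s2)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def viterbi_decode(received_words):
    """Return (decoded bits, final state metrics) for a hard-decision received sequence."""
    gamma = {s: (0 if s == (0, 0) else INF) for s in STATES}
    paths = {s: [] for s in STATES}
    for z in received_words:
        new_gamma = {s: INF for s in STATES}
        new_paths = {s: [] for s in STATES}
        for s in STATES:
            if gamma[s] == INF:              # state not yet reachable
                continue
            for bit in (0, 1):
                nxt, word = branch(s, bit)
                candidate = gamma[s] + hamming(z, word)      # add
                if new_gamma[nxt] > candidate:               # compare
                    new_gamma[nxt] = candidate               # select
                    new_paths[nxt] = paths[s] + [bit]
        gamma, paths = new_gamma, new_paths
    best_state = min(STATES, key=lambda s: gamma[s])
    return paths[best_state], gamma

z_words = [(1, 1), (0, 1), (0, 1), (1, 0), (0, 1)]   # Z = 11 01 01 10 01
decoded, final_metrics = viterbi_decode(z_words)
print(decoded, [final_metrics[s] for s in STATES])
```

A practical decoder would not wait for the end of the sequence; it would release the oldest bit once the stored history reaches the chosen decoding depth.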

7.3.6 Path Memory and Synchronization 


The storage requirements of the Viterbi decoder grow exponentially with
constraint length K. For a code with rate 1/n, the decoder retains a set of 2^(K-1)
paths after each decoding step. With high probability, these paths will not be
mutually disjoint very far back from the present decoding depth [12]. All of the
2^(K-1) paths tend to have a common stem which eventually branches to the various
states. Thus if the decoder stores enough of the history of the 2^(K-1) paths, the
oldest bits on all paths will be the same. A simple decoder implementation, then,
contains a fixed amount of path history and outputs the oldest bit on an arbitrary
path each time it steps one level deeper into the trellis. The amount of path storage
required is [12]

$$u = h\, 2^{K-1} \qquad (7.10)$$

where h is the length of the information bit path history per state. A refinement,
which minimizes the value of h, uses the oldest bit on the most likely path as the
decoder output, instead of the oldest bit on an arbitrary path. It has been
demonstrated [12] that a value of h of 4 or 5 times the code constraint length is
sufficient for near-optimum decoder performance. The storage requirement u is the
basic limitation on the implementation of Viterbi decoders. Commercial decoders
are limited to a constraint length of about K = 10. Efforts to increase coding gain
by further increasing constraint length are met by the exponential increase in
memory requirements (and complexity) that follows from Equation (7.10).
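Equation (7.10) is easy to tabulate. The sketch below assumes a path history of h = 5K bits per state, the upper end of the 4-to-5-times rule quoted above; the function name and the default multiple are illustrative choices.

```python
def path_storage_bits(K, history_multiple=5):
    """Path storage u = h * 2^(K-1), with h taken as history_multiple * K bits per state."""
    h = history_multiple * K
    return h * 2 ** (K - 1)

for K in (3, 5, 7, 10):
    print(K, path_storage_bits(K))
```

Going from K = 3 to K = 10 raises the storage from 60 bits to 25600 bits, which illustrates the exponential growth that limits practical constraint lengths.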

Branch word synchronization is the process of determining the beginning of a
branch word in the received sequence. Such synchronization can take place without
new information being added to the transmitted symbol stream because the received
data appear to have an excessive error rate when not synchronized. Therefore, a
simple way of accomplishing synchronization is to monitor some concomitant
indication of this large error rate, that is, the rate at which the state metrics are
increasing or the rate at which the surviving paths in the trellis merge. The
monitored parameters are compared to a threshold, and synchronization is then
adjusted accordingly.


7.4 PROPERTIES OF CONVOLUTIONAL CODES 

7.4.1 Distance Properties of Convolutional Codes 

Consider the distance properties of convolutional codes in the context of the
simple encoder in Figure 7.3 and its trellis diagram in Figure 7.7. We want to
evaluate the distance between all possible pairs of codeword sequences. As in the
case of block codes (see Section 6.5.2), we are interested in the minimum distance
between all pairs of such codeword sequences in the code, since the minimum
distance is related to the error-correcting capability of the code. Because a
convolutional code is a group or linear code [6], there is no loss in generality in
simply finding the minimum distance between each of the codeword sequences and
the all-zeros sequence. In other words, for a linear code, any test message is just as
"good" as any other test message. So, why not choose one that is easy to keep track
of, namely, the all-zeros sequence? Assuming that the all-zeros input sequence was
transmitted, the paths of interest are those that start and end in the 00 state and do
not return to the 00 state anywhere in between. An error will occur whenever the
distance of any other path that merges with the a = 00 state at time t_i is less than
that of the all-zeros path up to time t_i, causing the all-zeros path to be discarded in
the decoding process. In other words, given the all-zeros transmission, an error
occurs whenever the all-zeros path does not survive. Thus, an error of interest is
associated with a surviving path that diverges from and then remerges to the
all-zeros path. One might ask, Why is it necessary for the path to remerge? Isn't the
divergence enough to indicate an error? Yes, of course, but an error characterized
by only a divergence means that the decoder, from that point on, will be outputting
"garbage" for the rest of the message duration. We want to quantify the decoder's
capability in terms of errors that will usually take place; that is, we want to learn
the "easiest" way for the decoder to make an error. The minimum distance for
making such an error can be found by exhaustively examining every path from the
00 state to the 00 state. First, let us redraw the trellis diagram, shown in
Figure 7.16, labeling each branch with its Hamming distance from the all-zeros
codeword instead of with its branch word symbols. The Hamming distance between
two unequal-length sequences will be found by first appending the necessary
number of zeros to the shorter sequence to make the two sequences equal in length.
Consider all the paths that diverge from the all-zeros path and then remerge for the
first time at some arbitrary node. From Figure 7.16 we can compute the distances
of these paths from the all-zeros path. There is one path at distance 5 from the
all-zeros path; this path departs from the all-zeros path at time t_1 and merges with
it at time t_4. Similarly, there are two paths at distance 6, one which departs at time
t_1 and merges at time t_5, and the other which departs at time t_1 and merges at time
t_6, and so on. We can also see from the dashed and solid lines of the diagram that
the input bits for the distance 5 path are 1 0 0; it differs in only one input bit from
the all-zeros input sequence. Similarly, the input bits for the distance 6 paths are
1 1 0 0 and 1 0 1 0 0; each differs in two positions from the all-zeros path. The
minimum distance in the set of all arbitrarily long paths that diverge and remerge,
called the minimum free distance, or simply the free distance, is seen to be 5 in this
example, as shown with the heavy lines in Figure 7.16. For calculating the
error-correcting capability of the code, we repeat Equation (6.44) with the minimum
distance d_min replaced by the free distance d_f as


$$
t = \left\lfloor \frac{d_f - 1}{2} \right\rfloor \tag{7.11}
$$

Figure 7.16 Trellis diagram, labeled with distances from the all-zeros path.
(States: a = 00, b = 10, c = 01, d = 11.)


where ⌊x⌋ means the largest integer no greater than x. Setting d_f = 5, we see that the
code characterized by the Figure 7.3 encoder can correct any two channel errors.
(See Section 7.4.1.1.)
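
The free distance and the resulting error-correcting capability can also be found by machine rather than by inspecting the trellis. The following Python sketch is an illustration only, not part of the original development. It assumes that the Figure 7.3 encoder has connection vectors 111 and 101 (the K = 3, rate 1/2 entry of Table 7.4), treats the Hamming weight of each branch word as a path cost, and runs a shortest-path search from the first diverging branch back to the all-zeros state. The names GENS, branch, and free_distance are hypothetical.

```python
import heapq

# Assumed connection vectors of the Figure 7.3 encoder (rate 1/2, K = 3).
GENS = ((1, 1, 1), (1, 0, 1))

def branch(state, bit, gens=GENS):
    """Shift one input bit into the register; return (next_state, branch_weight)."""
    reg = (bit,) + state                      # newest bit enters on the left
    weight = sum(sum(g * r for g, r in zip(gen, reg)) % 2 for gen in gens)
    return reg[:-1], weight

def free_distance(gens=GENS):
    """Minimum weight of any path that leaves the all-zeros state and remerges."""
    zero = (0,) * (len(gens[0]) - 1)
    state, weight = branch(zero, 1, gens)     # the only way to diverge is input 1
    heap = [(weight, state)]
    settled = set()
    while heap:
        weight, state = heapq.heappop(heap)
        if state == zero:
            return weight                     # lightest remerging path found
        if state in settled:
            continue
        settled.add(state)
        for bit in (0, 1):
            nxt, w = branch(state, bit, gens)
            heapq.heappush(heap, (weight + w, nxt))
    raise ValueError("no path remerges with the all-zeros state")

d_f = free_distance()
t = (d_f - 1) // 2                            # Equation (7.11)
print(d_f, t)
```

Because branch weights are never negative, the first time the all-zeros state is popped from the queue its weight is the free distance; for the assumed encoder this gives d_f = 5 and t = 2.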

A trellis diagram represents "the rules of the game." It is a shorthand description
of all the possible transitions and their corresponding start and finish states
associated with a particular finite-state machine. The trellis diagram offers some
insight into the benefit (coding gain) when using error-correction coding. Consider
Figure 7.16 and the possible divergence-remergence error paths. From this picture
one sees that the decoder cannot make an error in any arbitrary way. The error
path must follow one of the allowable transitions. The trellis pinpoints all such
allowable paths. By having encoded the data in this way, we have placed constraints
on the transmitted signal. The decoder knows these constraints, and this knowledge
enables the system to more easily (using less E_b/N_0) meet some error performance
requirements.

Although Figure 7.16 presents the computation of free distance in a straightforward
way, a more direct closed-form expression can be obtained by starting with
the state diagram in Figure 7.5. First, we label the branches of the state diagram as
either D^0 = 1, D^1, or D^2, shown in Figure 7.17, where the exponent of D denotes the
Hamming distance from the branch word of that branch to the all-zeros branch.
The self-loop at node a can be eliminated since it contributes nothing to the distance
properties of a codeword sequence relative to the all-zeros sequence. Furthermore,
node a can be split into two nodes (labeled a and e), one of which
represents the input and the other the output of the state diagram. All paths originating
at a = 00 and terminating at e = 00 can be traced on the modified state diagram
of Figure 7.17. We can calculate the transfer function of path a b c e (starting
and ending at state 00) in terms of the indeterminate "placeholder" D, as D^2 D D^2
= D^5. The exponent of D represents the cumulative tally of the number of ones in
the path, and hence the Hamming distance from the all-zeros path. Similarly, the
paths a b d c e and a b c b c e each have the transfer function D^6 and thus a Hamming
distance of 6 from the all-zeros path.

Figure 7.17 State diagram, labeled according to distance from the all-zeros path.

We now write the state equations as


$$
\begin{aligned}
X_b &= D^2 X_a + X_c \\
X_c &= D X_b + D X_d \\
X_d &= D X_b + D X_d \\
X_e &= D^2 X_c
\end{aligned} \tag{7.12}
$$


where X_a, ..., X_e are dummy variables for the partial paths to the intermediate
nodes. The transfer function, T(D), sometimes called the generating function of the
code, can be expressed as T(D) = X_e/X_a. By solving the state equations shown in
Equation (7.12), we obtain [15, 16]

$$
T(D) = \frac{D^5}{1 - 2D} = D^5 + 2D^6 + 4D^7 + \cdots + 2^{l} D^{l+5} + \cdots \tag{7.13}
$$
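
The elimination behind this result is short enough to write out. The following worked steps are a sketch that uses only the four state equations of Equation (7.12); the order of elimination is one convenient choice among several.

$$
\begin{aligned}
X_d &= D X_b + D X_d &&\Rightarrow\quad X_d = \frac{D}{1 - D}\, X_b \\
X_c &= D X_b + D X_d = D X_b \left(1 + \frac{D}{1 - D}\right) &&\Rightarrow\quad X_c = \frac{D}{1 - D}\, X_b \\
X_b &= D^2 X_a + X_c = D^2 X_a + \frac{D}{1 - D}\, X_b &&\Rightarrow\quad X_b = \frac{D^2 (1 - D)}{1 - 2D}\, X_a \\
X_e &= D^2 X_c = D^2 \cdot \frac{D}{1 - D} \cdot \frac{D^2 (1 - D)}{1 - 2D}\, X_a &&\Rightarrow\quad \frac{X_e}{X_a} = \frac{D^5}{1 - 2D}
\end{aligned}
$$

Expanding 1/(1 - 2D) as the geometric series 1 + 2D + 4D^2 + ... then gives the power series form, with coefficient 2^l on the term D^{l+5}.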


The transfer function for this code indicates that there is a single path of distance 5
from the all-zeros path, two of distance 6, four of distance 7, and in general, there
are 2^l paths of distance l + 5 from the all-zeros path, where l = 0, 1, 2, .... The free
distance d_f of the code is the Hamming weight of the lowest-order term in the
expansion of T(D). In this example d_f = 5. In evaluating distance properties, the
transfer function, T(D), cannot be used for long constraint lengths since the complexity
of T(D) increases exponentially with constraint length.

The transfer function can be used to provide more detailed information than
just the distance of the various paths. Let us introduce a factor L into each branch
of the state diagram so that the exponent of L can serve as a counter to indicate the
number of branches in any given path from state a = 00 to state e = 00. Furthermore,
we can introduce a factor N into all branch transitions caused by the input bit
one. Thus, as each branch is traversed, the cumulative exponent on N increases by
one, only if that branch transition is due to an input bit one. For the convolutional
code characterized in the Figure 7.3 example, the additional factors L and N are
shown on the modified state diagram of Figure 7.18. Equations (7.12) can now be
modified as follows:


$$
\begin{aligned}
X_b &= D^2 L N X_a + L N X_c \\
X_c &= D L X_b + D L X_d \\
X_d &= D L N X_b + D L N X_d \\
X_e &= D^2 L X_c
\end{aligned} \tag{7.14}
$$

The transfer function of this augmented state diagram is


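$$
\begin{aligned}
T(D, L, N) &= \frac{D^5 L^3 N}{1 - D L (1 + L) N} \\
&= D^5 L^3 N + D^6 L^4 (1 + L) N^2 + D^7 L^5 (1 + L)^2 N^3 \\
&\quad + \cdots + D^{l+5} L^{l+3} (1 + L)^{l} N^{l+1} + \cdots
\end{aligned} \tag{7.15}
$$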


Thus, we can verify some of the path properties displayed in Figure 7.16. There is
one path of distance 5, length 3, which differs in one input bit from the all-zeros
path. There are two paths of distance 6, one of which is length 4, the other length 5,
and both differ in two input bits from the all-zeros path. Also, of the distance 7
paths, one is of length 5, two are of length 6, and one is of length 7; all four paths
correspond to input sequences that differ in three input bits from the all-zeros path.
Thus if the all-zeros path is the correct path and the noise causes us to choose one
of the incorrect paths of distance 7, three bit errors will be made.
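
These path properties can be checked by direct enumeration instead of by expanding T(D, L, N). The Python sketch below is an illustration under stated assumptions: it takes the Figure 7.3 encoder to have connection vectors 111 and 101, and it relies on the fact that this code has no zero-weight loop among the nonzero states, so pruning on distance guarantees termination. The names GENS, branch, and path_census are hypothetical.

```python
from collections import Counter

GENS = ((1, 1, 1), (1, 0, 1))   # assumed taps of the Figure 7.3 encoder

def branch(state, bit):
    """Return (next_state, Hamming weight of the branch word)."""
    reg = (bit,) + state
    weight = sum(sum(g * r for g, r in zip(gen, reg)) % 2 for gen in GENS)
    return reg[:-1], weight

def path_census(max_distance):
    """Count first-remerge paths by (distance, length, number of input ones)."""
    zero = (0, 0)
    state, w = branch(zero, 1)                      # diverge with input bit one
    frontier = Counter({(state, w, 1, 1): 1})       # (state, D, L, N exponents)
    census = Counter()
    while frontier:
        extended = Counter()
        for (state, d, l, n), count in frontier.items():
            for bit in (0, 1):
                nxt, w = branch(state, bit)
                key = (d + w, l + 1, n + bit)
                if key[0] > max_distance:
                    continue                        # prune heavier paths
                if nxt == zero:
                    census[key] += count            # remerged: record the term
                else:
                    extended[(nxt,) + key] += count
        frontier = extended
    return census

for (d, l, n), count in sorted(path_census(7).items()):
    print(f"{count} path(s): distance {d}, length {l}, input ones {n}")
```

Each printed line corresponds to one term D^d L^l N^n of Equation (7.15), with the count as its coefficient.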



Figure 7.18 State diagram, labeled according to distance, length, and number
of input ones.


7.4.1.1 Error-Correcting Capability of Convolutional Codes

In the study of block codes in Chapter 6, we saw that the error-correcting
capability, t, represented the number of code symbol errors that could, with maximum
likelihood decoding, be corrected in each block length of the code. However,
when decoding convolutional codes, the error-correcting capability cannot be
stated so succinctly. With regard to Equation (7.11), we can say that the code can,
with maximum likelihood decoding, correct t errors within a few constraint lengths,
where few here means 3 to 5. The exact length depends on how the errors are
distributed. For a particular code and error pattern, the length can be bounded
using transfer function methods. Such bounds are described later.

7.4.2 Systematic and Nonsystematic Convolutional Codes 

A systematic convolutional code is one in which the input k-tuple appears as part of
the output branch word n-tuple associated with that k-tuple. Figure 7.19 shows a binary,
rate 1/2, K = 3 systematic encoder. For linear block codes, any nonsystematic
code can be transformed into a systematic code with the same block distance properties.
This is not the case for convolutional codes. The reason for this is that convolutional
codes depend largely on free distance; making the convolutional code
systematic, in general, reduces the maximum possible free distance for a given
constraint length and rate.

Table 7.1 shows the maximum free distance for rate 1/2 systematic and nonsystematic
codes for K = 2 through 8. For large constraint lengths the results are even
more widely separated [17].


Figure 7.19 Systematic convolutional encoder, rate 1/2, K = 3.


TABLE 7.1 Comparison of Systematic and Nonsystematic Free Distance, Rate 1/2

Constraint Length    Free Distance (Systematic)    Free Distance (Nonsystematic)

Source: A. J. Viterbi and J. K. Omura, Principles of Digital Communication and
Coding, McGraw-Hill Book Company, New York, 1979, p. 251.

7.4.3 Catastrophic Error Propagation in Convolutional Codes

A catastrophic error is defined as an event whereby a finite number of code symbol
errors cause an infinite number of decoded data bit errors. Massey and Sain [18]
have derived a necessary and sufficient condition for convolutional codes to display
catastrophic error propagation. For rate 1/n codes with register taps designated by
polynomial generators, as described in Section 7.2.1, the condition for catastrophic
error propagation is that the generators have a common polynomial factor (of
degree at least one). For example, Figure 7.20a illustrates a rate 1/2, K = 3 encoder
with upper polynomial g_1(X) and lower polynomial g_2(X), as follows:

$$
\begin{aligned}
g_1(X) &= 1 + X \\
g_2(X) &= 1 + X^2
\end{aligned} \tag{7.16}
$$

The generators g_1(X) and g_2(X) have in common the polynomial factor 1 + X, since

$$
1 + X^2 = (1 + X)(1 + X)
$$

Therefore, the encoder in Figure 7.20a can manifest catastrophic error propagation.
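
The common-factor test stated above is easy to mechanize with Euclid's algorithm over GF(2). The Python sketch below is illustrative only. It represents each generator as an integer bit mask (bit i holds the coefficient of X^i), which is a convention chosen here rather than one taken from the text, and it applies the condition exactly as worded above: a greatest common divisor of degree at least one. The names gf2_mod, gf2_gcd, and is_catastrophic are hypothetical.

```python
def gf2_mod(a, b):
    """Remainder of a divided by b for GF(2) polynomials stored as bit masks."""
    while a and a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())   # cancel the leading term
    return a

def gf2_gcd(a, b):
    """Greatest common divisor of two GF(2) polynomials (Euclid's algorithm)."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a

def is_catastrophic(generators):
    """True if the generators share a polynomial factor of degree >= 1."""
    common = 0
    for g in generators:
        common = gf2_gcd(common, g)
    return common.bit_length() > 1

# Figure 7.20a: g_1(X) = 1 + X, g_2(X) = 1 + X^2.
print(is_catastrophic([0b11, 0b101]))
# Assumed Figure 7.3 generators: 1 + X + X^2 and 1 + X^2.
print(is_catastrophic([0b111, 0b101]))
```

For the Figure 7.20a pair the common divisor is 1 + X, so the first call reports a catastrophic code; the second pair has no common factor.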




Figure 7.20 Encoder displaying catastrophic error propagation. (a) Encoder.
(b) State diagram.

In terms of the state diagram for any-rate code, catastrophic errors can occur
if, and only if, any closed-loop path in the diagram has zero weight (zero distance
from the all-zeros path). To illustrate this, consider the example of Figure 7.20. The
state diagram in Figure 7.20b is drawn with the state a = 00 node split into two
nodes, a and e, as before. Assuming that the all-zeros path is the correct path, the
incorrect path a b d d ... d c e has exactly 6 ones, no matter how many times we go
around the self-loop at node d. Thus for a BSC, for example, three channel errors
may cause us to choose this incorrect path. An arbitrarily large number of errors
(two plus the number of times the self-loop is traversed) can be made on such a
path. We observe that for rate 1/n codes, if each adder in the encoder has an even
number of connections, the self-loop corresponding to the all-ones data state will
have zero weight, and consequently, the code will be catastrophic.

The only advantage of a systematic code, described earlier, is that it can never
be catastrophic, since each closed loop must contain at least one branch generated
by a nonzero input bit, and thus each closed loop must have a nonzero code symbol.
However, it can be shown [19] that only a small fraction of nonsystematic
codes (excluding those where all adders have an even number of taps) are
catastrophic.

7.4.4 Performance Bounds for Convolutional Codes 


The probability of bit error, P_B, for a binary convolutional code using hard-decision
decoding can be shown [8] to be upper bounded as follows:

$$
P_B \le \left. \frac{dT(D, N)}{dN} \right|_{N = 1,\; D = 2\sqrt{p(1 - p)}} \tag{7.17}
$$

where p is the probability of channel symbol error. For the example of Figure 7.3,
T(D, N) is obtained from T(D, L, N) by setting L = 1 in Equation (7.15):

$$
T(D, N) = \frac{D^5 N}{1 - 2DN} \tag{7.18}
$$

and

$$
\left. \frac{dT(D, N)}{dN} \right|_{N = 1} = \frac{D^5}{(1 - 2D)^2} \tag{7.19}
$$

Combining Equations (7.17) and (7.19), we can write

$$
P_B \le \frac{\{2[p(1 - p)]^{1/2}\}^5}{\{1 - 4[p(1 - p)]^{1/2}\}^2} \tag{7.20}
$$
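
To get a feel for the size of this bound, it can be evaluated numerically. The Python sketch below is an illustration, not a result from the text: it simply evaluates the right-hand side of Equation (7.20) for a few channel symbol error probabilities chosen here as examples. It also guards against values of p for which 2D >= 1, where the power series for T(D, N) no longer converges and the bound is meaningless. The function name hard_decision_bound is hypothetical.

```python
import math

def hard_decision_bound(p):
    """Right-hand side of Equation (7.20) for channel symbol error probability p."""
    d = 2.0 * math.sqrt(p * (1.0 - p))       # D = 2 * sqrt(p(1 - p))
    if 2.0 * d >= 1.0:
        raise ValueError("series for T(D, N) does not converge for this p")
    return d**5 / (1.0 - 2.0 * d) ** 2

for p in (1e-2, 1e-3, 1e-4):
    print(f"p = {p:g}: P_B <= {hard_decision_bound(p):.3e}")
```

The convergence condition 2D < 1 restricts the bound to p below roughly 0.067 for this code.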

For coherent BPSK modulation over an additive white Gaussian noise
(AWGN) channel, it can be shown [8] that the bit error probability is bounded by

$$
P_B \le Q\left(\sqrt{\frac{2 d_f E_c}{N_0}}\right) \exp\left(\frac{d_f E_c}{N_0}\right)
\left. \frac{dT(D, N)}{dN} \right|_{N = 1,\; D = \exp(-E_c/N_0)} \tag{7.21}
$$

where

E_c/N_0 = r E_b/N_0

E_b/N_0 = ratio of information bit energy to noise power spectral density

E_c/N_0 = ratio of channel symbol energy to noise power spectral density

r = k/n = rate of the code

and Q(x) is defined in Equations (3.43) and (3.44) and tabulated in Table B.1.
Therefore, for the rate 1/2 code with free distance d_f = 5, in conjunction with coherent
BPSK and hard-decision decoding, we can write

$$
\begin{aligned}
P_B &\le Q\left(\sqrt{\frac{5E_b}{N_0}}\right) \exp\left(\frac{5E_b}{2N_0}\right)
\frac{\exp(-5E_b/2N_0)}{[1 - 2\exp(-E_b/2N_0)]^2} \\
&\le \frac{Q\left(\sqrt{5E_b/N_0}\right)}{[1 - 2\exp(-E_b/2N_0)]^2}
\end{aligned} \tag{7.22}
$$
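
Equation (7.22) can be evaluated directly once Q(x) is expressed through the complementary error function, Q(x) = (1/2) erfc(x/sqrt(2)). The Python sketch below is illustrative only; the sample E_b/N_0 values are chosen here and do not come from the text. The bound is valid only where 2 exp(-E_b/2N_0) < 1, that is, for E_b/N_0 above 2 ln 2 (about 1.4 dB), so the function rejects smaller values. The names q_function and awgn_bound_rate_half are hypothetical.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x), written in terms of erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def awgn_bound_rate_half(ebn0_db):
    """Final expression of Equation (7.22) for the rate 1/2, d_f = 5 code."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)          # convert dB to a power ratio
    d = math.exp(-ebn0 / 2.0)                # D = exp(-E_c/N_0), E_c = E_b/2
    if 2.0 * d >= 1.0:
        raise ValueError("bound is not valid at this E_b/N_0")
    return q_function(math.sqrt(5.0 * ebn0)) / (1.0 - 2.0 * d) ** 2

for ebn0_db in (4.0, 6.0, 8.0):
    print(f"E_b/N_0 = {ebn0_db} dB: P_B <= {awgn_bound_rate_half(ebn0_db):.3e}")
```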


7.4.5 Coding Gain 

Coding gain, as presented in Equation (6.19), is defined as the reduction, usually
expressed in decibels, in the required E_b/N_0 to achieve a specified error probability
of the coded system over an uncoded system with the same modulation and channel
characteristics. Table 7.2 lists an upper bound on the coding gains, compared to
uncoded coherent BPSK, for several maximum free distance convolutional codes
with constraint lengths varying from 3 to 9 over a Gaussian channel with hard-decision
decoding. The table illustrates that it is possible to achieve significant coding
gain even with a simple convolutional code. The actual coding gain will vary
with the required bit error probability [20].

Table 7.3 lists the measured coding gains, compared to uncoded coherent
BPSK, achieved with hardware implementation or computer simulation over a
Gaussian channel with soft-decision decoding [21]. The uncoded E_b/N_0 is given in
the leftmost column. From Table 7.3 we can see that coding gain increases as the
bit error probability is decreased. However, the coding gain cannot increase indefinitely;
it has an upper bound as shown in the table. This bound in decibels can be
shown [21] to be

$$
\text{coding gain} \le 10 \log_{10}(r\, d_f) \tag{7.23}
$$

where r is the code rate and d_f is the free distance. Examination of Table 7.3 also
reveals that at P_B = 10^-7, for code rates of 1/2 and 2/3, the weaker codes tend to be
closer to the upper bound than are the more powerful codes.
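
The bound of Equation (7.23) is a one-line computation. The short Python sketch below is an illustration: it evaluates 10 log10(r d_f) for a few (rate, free distance) pairs taken from Table 7.2, so the results can be compared with the tabulated upper bounds. The function name coding_gain_bound_db is hypothetical.

```python
import math

def coding_gain_bound_db(rate, d_f):
    """Upper bound on coding gain in dB, Equation (7.23)."""
    return 10.0 * math.log10(rate * d_f)

# (rate, free distance) pairs taken from Table 7.2.
for rate, d_f in ((1 / 2, 5), (1 / 2, 10), (1 / 3, 8), (1 / 3, 18)):
    print(f"rate {rate:.3f}, d_f = {d_f}: {coding_gain_bound_db(rate, d_f):.2f} dB")
```

Small differences in the last digit relative to the table can arise from rounding versus truncation.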

TABLE 7.2 Coding Gain Upper Bounds for Some Convolutional Codes

        Rate 1/2 Codes                       Rate 1/3 Codes
  K    d_f    Upper Bound (dB)         K    d_f    Upper Bound (dB)
  3     5         3.97                 3     8         4.26
  4     6         4.76                 4    10         5.23
  5     7         5.43                 5    12         6.02
  6     8         6.00                 6    13         6.37
  7    10         6.99                 7    15         6.99
  8    10         6.99                 8    16         7.27
  9    12         7.78                 9    18         7.78

Source: V. K. Bhargava, D. Haccoun, R. Matyas, and P. Nuspl, Digital Communications by
Satellite, John Wiley & Sons, Inc., New York, 1981.

Typically, Viterbi decoding is used over binary input channels with either
hard or 3-bit soft quantized outputs. The constraint lengths vary between 3 and 9,
the code rate is rarely smaller than 1/3, and the path memory is usually a few constraint
lengths [12]. The path memory refers to the depth of the input bit history
stored by the decoder. From the Viterbi decoding example in Section 7.3.4, one
might question the notion of a fixed path memory. It seems from the example that
the decoding of a branch word, at any arbitrary node, can take place as soon as
there is only a single surviving branch at that node. That is true; however, to actually
implement the decoder in this way would entail an extensive amount of processing
to continually check when the branch word can be decoded. Instead, a fixed
delay is provided, after which the branch word is decoded. It has been shown [12,
22] that tracing back from the state with the lowest state metric, over a fixed
amount of path history (about 4 or 5 times the constraint length), is sufficient to
limit the degradation from the optimum decoder performance to about 0.1 dB for
the BSC and Gaussian channels. Typical error performance simulation results are
shown in Figure 7.21 for Viterbi decoding with hard decision quantization [12]. Notice
that each increment in constraint length improves the required E_b/N_0 by a factor
of approximately 0.5 dB at P_B = 10^-5.


TABLE 7.3 Basic Coding Gain (dB) for Soft Decision Viterbi Decoding

Uncoded                 Code rate:     1/3            1/2             2/3          3/4
E_b/N_0 (dB)    P_B     K:           7     8     5     6     7     6     8     6     9

    6.8         10^-3               4.2   4.4   3.3   3.5   3.8   2.9   3.1   2.6   2.6
    9.6         10^-5               5.7   5.9   4.3   4.6   5.1   4.2   4.6   3.6   4.2
   11.3         10^-7               6.2   6.5   4.9   5.3   5.8   4.7   5.2   3.9   4.8
Upper bound                         7.0   7.3   5.4   6.0   7.0   5.2   6.7   4.8   5.7

Source: I. M. Jacobs, "Practical Applications of Coding," IEEE Trans. Inf. Theory, vol. IT20, May 1974, pp.
305-310.

Figure 7.21 Bit error probability versus E_b/N_0 for rate 1/2 codes using
coherent BPSK over a BSC, Viterbi decoding, and a 32-bit path memory.
(Reprinted with permission from J. A. Heller and I. M. Jacobs,
"Viterbi Decoding for Satellite and Space Communication," IEEE Trans.
Commun. Technol., vol. COM19, no. 5, October 1971, Fig. 7, p. 84. ©
1971 IEEE.)

7.4.6 Best Known Convolutional Codes 

The connection vectors or polynomial generators of a convolutional code are usually
selected based on the code's free distance properties. The first criterion is to
select a code that does not have catastrophic error propagation and that has the
maximum free distance for the given rate and constraint length. Then the number
of paths at the free distance d_f, or the number of data bit errors the paths represent,
should be minimized. The selection procedure can be further refined by considering
the number of paths or bit errors at d_f + 1, at d_f + 2, and so on, until only one
code or class of codes remains. A list of the best known codes of rate 1/2, K = 3 to 9,
and rate 1/3, K = 3 to 8, based on this criterion was compiled by Odenwalder [3, 23]
and is given in Table 7.4. The connection vectors in this table represent the presence
or absence (1 or 0) of a tap connection on the corresponding stage of the convolutional
encoder, the leftmost term corresponding to the leftmost stage of the encoder
register. It is interesting to note that these connections can be inverted
(leftmost and rightmost can be interchanged in the above description). Under
the condition of Viterbi decoding, the inverted connections give rise to codes
with identical distance properties, and hence identical performance, as those in
Table 7.4.

TABLE 7.4 Optimum Short Constraint Length Convolutional Codes
(Rate 1/2 and Rate 1/3)

Rate    Constraint Length    Free Distance    Code Vectors
1/2            3                   5          111, 101
1/2            4                   6          1111, 1011
1/2            5                   7          10111, 11001
1/2            6                   8          101111, 110101
1/2            7                  10          1001111, 1101101
1/2            8                  10          10011111, 11100101
1/2            9                  12          110101111, 100011101
1/3            3                   8          111, 111, 101
1/3            4                  10          1111, 1011, 1101
1/3            5                  12          11111, 11011, 10101
1/3            6                  13          101111, 110101, 111001
1/3            7                  15          1001111, 1010111, 1101101
1/3            8                  16          11101111, 10011011, 10101001

Source: J. P. Odenwalder, Error Control Coding Handbook, Linkabit Corp.,
San Diego, Calif., July 15, 1976.

7.4.7 Convolutional Code Rate Trade-Off


7.4.7.1 Performance with Coherent PSK Signaling

The error-correcting capability of a coding scheme increases as the number of
channel symbols n per information bit k increases, or the rate k/n decreases. However,
the channel bandwidth and the decoder complexity both increase with n. The
advantage of lower code rates when using convolutional codes with coherent PSK
is that the required E_b/N_0 is decreased (for a large range of code rates), permitting
the transmission of higher data rates for a given amount of power, or permitting reduced
power for a given data rate. Simulation studies have shown [16, 22] that for a
fixed constraint length, a decrease in the code rate from 1/2 to 1/3 results in a reduction
of the required E_b/N_0 of roughly 0.4 dB. However, the corresponding increase in
decoder complexity is about 17%. For smaller values of code rate, the improvement
in performance relative to the increased decoding complexity diminishes
rapidly [22]. Eventually, a point is reached where further decrease in code rate is
characterized by a reduction in coding gain. (See Section 9.7.7.2.)

7.4.7.2 Performance with Noncoherent Orthogonal Signaling 

In contrast to PSK, there is an optimum code rate of about 1/2 for noncoherent
orthogonal signaling. Error performance at rates of 1/3, 2/3, and 3/4 are each worse than
those for rate 1/2. For a fixed constraint length, the rate 1/3, 2/3, and 3/4 codes typically
degrade by about 0.25, 0.5, and 0.3 dB, respectively, relative to the rate 1/2 performance
[16].

7.4.8 Soft-Decision Viterbi Decoding 

For a rate 1/2 binary convolutional code system, the demodulator delivers two code
symbols at a time to the decoder. For hard-decision (2-level) decoding, each pair of
received code symbols can be depicted on a plane, as one of the corners of a
square, as shown in Figure 7.22a. The corners are labeled with the binary numbers
(0,0), (0,1), (1,0), and (1,1), representing the four possible hard-decision values that
the two code symbols might have. For 8-level soft-decision decoding, each pair of
code symbols can be similarly represented on an equally spaced 8-level by 8-level
plane, as a point from the set of 64 points shown in Figure 7.22b. In this
soft-decision case, the demodulator no longer delivers firm decisions; it delivers
quantized noisy signals (soft decisions).

The primary difference between hard-decision and soft-decision Viterbi decoding
is that the soft-decision algorithm cannot use a Hamming distance metric
because of its limited resolution. A distance metric with the needed resolution is
Euclidean distance, and to facilitate its use, the binary numbers of 1 and 0 are
transformed to the octal numbers 7 and 0, respectively. This can be seen in Figure
7.22c, where the corners of the square have been relabeled accordingly; this allows
us to use a pair of integers, each in the range of 0 to 7, for describing any point in
the 64-point set. Also shown in Figure 7.22c is the point 5,4, representing an
example of a pair of noisy code-symbol values that might stem from a
demodulator. Imagine that the square in Figure 7.22c has coordinates x and y.
Then, what is the Euclidean distance between the noisy point 5,4 and the noiseless
point 0,0? It is sqrt((5 - 0)^2 + (4 - 0)^2) = sqrt(41). Similarly, if we ask what is the
Euclidean distance between the noisy point 5,4 and the noiseless point 7,7?
It is sqrt((5 - 7)^2 + (4 - 7)^2) = sqrt(13).

Figure 7.22 (a) Hard-decision plane. (b) 8-level by 8-level soft-decision
plane. (c) Example of soft code symbols. (d) Encoding trellis section. (e) Decoding
trellis section.

Soft-decision Viterbi decoding, for the most part, proceeds in the same way as
hard-decision decoding (as described in Sections 7.3.4 and 7.3.5). The only difference
is that Hamming distances are not used. Consider how soft-decision decoding is performed
with the use of Euclidean distances. Figure 7.22d shows the first section of an
encoding trellis, originally presented in Figure 7.7, with the branch words transformed
from binary to octal. Suppose that a pair of soft-decision code symbols with
values 5,4 arrives at a decoder during the first transition interval. Figure 7.22e shows
the first section of a decoding trellis. The metric sqrt(41), representing the Euclidean
distance between the arriving 5,4 and the 0,0 branch word, is placed on the solid line.
Similarly, the metric sqrt(13), representing the Euclidean distance between the arriving
5,4 and the 7,7 code symbols, is placed on the dashed line. The rest of the task,
pruning the trellis in search of a common stem, proceeds in the same way as hard-decision
decoding. Note that in a real convolutional decoding chip, the Euclidean distance
is not actually used for a soft-decision metric; instead, a monotonic metric that
has similar properties and is easier to implement is used. An example of such a metric
is the Euclidean distance-squared, in which case the square-root operation shown
above is eliminated. Further, if the binary code symbols are represented with bipolar
values, then the inner-product metric in Equation (7.9) can be used. With such a metric,
we would seek maximum correlation rather than minimum distance.
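
The branch-metric computation just described can be sketched in a few lines of Python. This is an illustration only: it maps binary branch-word symbols to the 8-level values 0 and 7, then computes both the Euclidean distance and the squared-distance metric for the received pair 5,4 used in the example. The function names to_levels, squared_metric, and euclidean_metric are hypothetical.

```python
import math

def to_levels(branch_word):
    """Map binary code symbols 0/1 to the 8-level soft values 0/7."""
    return tuple(7 * bit for bit in branch_word)

def squared_metric(received, branch_word):
    """Squared Euclidean distance between soft symbols and a branch word."""
    return sum((r - s) ** 2 for r, s in zip(received, to_levels(branch_word)))

def euclidean_metric(received, branch_word):
    """Euclidean distance; orders branches the same way as the squared metric."""
    return math.sqrt(squared_metric(received, branch_word))

received = (5, 4)                          # soft code symbols from the example
for word in ((0, 0), (1, 1)):              # branches leaving state 00
    print(word, squared_metric(received, word), euclidean_metric(received, word))
```

Because the square root is monotonic, both metrics select the same branch, which is why the squared form can replace the Euclidean distance in practice.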


7.5 OTHER CONVOLUTIONAL DECODING ALGORITHMS 
7.5.1 Sequential Decoding 

Prior to the discovery of an optimum algorithm by Viterbi, other algorithms had been
proposed for decoding convolutional codes. The earliest was the sequential decoding
algorithm, originally proposed by Wozencraft [24, 25] and subsequently modified by
Fano [2]. A sequential decoder works by generating hypotheses about the transmitted
codeword sequence; it computes a metric between these hypotheses and the received
signal. It goes forward as long as the metric indicates that its choices are likely; otherwise,
it goes backward, changing hypotheses until, through a systematic trial-and-error
search, it finds a likely hypothesis. Sequential decoders can be implemented to work
with hard or soft decisions, but soft decisions are usually avoided because they greatly
increase the amount of the required storage and the complexity of the computations.

Consider that using the encoder shown in Figure 7.3, a sequence m = 1 1 0 1 1
is encoded into the codeword sequence U = 1 1 0 1 0 1 0 0 0 1, as shown in Example
7.1. Assume that the received sequence Z is, in fact, a correct rendition of U. The
decoder has available a replica of the encoder code tree, shown in Figure 7.6, and
can use the received sequence Z to penetrate the tree. The decoder starts at the
time t_1 node of the tree and generates both paths leaving that node. The decoder
follows that path which agrees with the received n code symbols. At the next level
in the tree, the decoder again generates both paths leaving that node, and follows
the path agreeing with the second group of n code symbols. Proceeding in this
manner, the decoder quickly penetrates the tree.

Suppose, however, that the received sequence Z is a corrupted version of U.
The decoder starts at the time t_1 node of the code tree and generates both paths
leading from that node. If the received n code symbols coincide with one of the
generated paths, the decoder follows that path. If there is not agreement, the decoder
follows the most likely path but keeps a cumulative count on the number of
disagreements between the received symbols and the branch words on the path
being followed. If two branches appear equally likely, the receiver uses an arbitrary
rule, such as following the zero input path. At each new level in the tree, the decoder
generates new branches and compares them with the next set of n received
code symbols. The search continues to penetrate the tree along the most likely path
and maintains the cumulative disagreement count.

If the disagreement count exceeds a certain number (which may increase as
we penetrate the tree), the decoder decides that it is on an incorrect path, backs out
of the path, and tries another. The decoder keeps track of the discarded pathways
to avoid repeating any path excursions. For example, assume that the encoder in
Figure 7.3 is used to encode the message sequence m = 1 1 0 1 1 into the codeword
sequence U as shown in Example 7.1. Suppose that the fourth and seventh bits of
the transmitted sequence U are received in error, such that:

Time:                          t_1    t_2    t_3    t_4    t_5
Message sequence:       m =     1      1      0      1      1
Transmitted sequence:   U =    11     01     01     00     01
Received sequence:      Z =    11     00     01     10     01


Let us follow the decoder path trajectory with the aid of Figure 7.23. Assume 
that a cumulative path disagreement count of 3 is the criterion for backing up and 
trying an alternative path. On Figure 7.23 the numbers along the path trajectory 
represent the current disagreement count. 

1. At time t_1 we receive symbols 11 and compare them with the branch words
leaving the first node.

2. The most likely branch is the one with branch word 11 (corresponding to an
input bit one or downward branching), so the decoder decides that input bit
one is the correct decoding, and moves to the next level.

3. At time t_2, the decoder receives symbols 00 and compares them with the
available branch words 10 and 01 at this second level.

4. There is no "best" path, so the decoder arbitrarily takes the input bit zero (or branch
word 10) path, and the disagreement count registers a disagreement of 1.

5. At time t_3, the decoder receives symbols 01 and compares them with the
available branch words 11 and 00 at this third level.

6. Again, there is no best path, so the decoder arbitrarily takes the input zero
(or branch word 11) path, and the disagreement count is increased to 2.

7. At time t_4, the decoder receives symbols 10 and compares them with the
available branch words 00 and 11 at this fourth level.

8. Again, there is no best path, so the decoder takes the input bit zero (or
branch word 00) path, and the disagreement count is increased to 3.

9. But a disagreement count of 3 is the turnaround criterion, so the decoder
"backs out" and tries the alternative path. The disagreement counter is reset
to 2.

10. The alternative path is the input bit one (or branch word 11) path at the t_4
level. The decoder tries this, but compared to the received symbols 10, there
is still a disagreement of 1, and the counter is again increased to 3.

11. But, 3 being the turnaround criterion, the decoder backs out of this path, and
the counter is reset to 2. All of the alternatives have now been traversed at this
t_4 level, so the decoder returns to the node at t_3 and resets the counter to 1.

12. At the t_3 node, the decoder compares the symbols received at time t_3, namely
01, with the untried 00 path. There is a disagreement of 1, and the counter is
increased to 2.

13. At the t_4 node, the decoder follows the branch word 10 that matches its t_4
code symbols of 10. The counter remains unchanged at 2.

14. At the t_5 node, there is no best path, so the decoder follows the upper branch,
as is the rule, and the counter is increased to 3.

15. At this count, the decoder backs up, resets the counter to 2, and tries the
alternative path at node t_5. Since the alternate branch word is 00, there is a
disagreement of 1 with the received code symbols 01 at time t_5, and the
counter is again increased to 3.

16. The decoder backs out of this path, and the counter is reset to 2. All of the
alternatives have now been traversed at this t_5 level, so the decoder returns to
the node at t_4, where the counter remains at 2 because the matching t_4
branch added no disagreements.

17. The decoder tries the alternative path at t_4, which raises the metric to 4 since
there is a disagreement in two positions of the branch word. This time the
decoder must back up all the way to the time t_2 node because all of the
other paths at higher levels have been tried. The counter is now decremented
to zero.

18. At the t_2 node, the decoder now follows the branch word 01, and because
there is a disagreement of 1 with the received code symbols 00 at time t_2, the
counter is increased to 1.

Figure 7.23 Sequential decoding example.


The decoder continues in this way. As shown in Figure 7.23, the final path,
which has not increased the counter to its turnaround criterion, yields the
correctly decoded message sequence, 1 1 0 1 1. Sequential decoding can be viewed
as a trial-and-error technique for searching out the correct path in the code
tree. It performs the search in a sequential manner, always operating on just a
single path at a time. If an incorrect decision is made, subsequent extensions
of the path will be wrong. The decoder can eventually recognize its error by
monitoring the path metric. The algorithm is similar to the case of an
automobile traveler following a road map. As long as the traveler recognizes
that the passing landmarks correspond to those on the map, he continues on the
path. When he notices strange landmarks (an increase in his dissimilarity
metric), the traveler eventually assumes that he is on an incorrect road, and he
backs up to a point where he can now recognize the landmarks (his metric returns
to an acceptable range). He then tries an alternative road.
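
The trial-and-error search described above can be sketched as a depth-first
walk of the code tree. This is a simplified illustration, not the exact counter
bookkeeping of the numbered steps. It assumes the K = 3, rate 1/2 encoder of
Figure 7.3 (connection vectors 111 and 101), a turnaround criterion of three
disagreements, and a received sequence Z = 11 00 01 10 01. The function names
are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Pair = Tuple[int, int]


def branch(state: Pair, bit: int) -> Tuple[Pair, Pair]:
    """Return (branch word, next state) for one input bit.

    Assumed encoder: K = 3, rate 1/2, connection vectors 111 and 101.
    state = (s1, s2), where s1 is the most recent previous input bit.
    """
    s1, s2 = state
    word = (bit ^ s1 ^ s2, bit ^ s2)
    return word, (bit, s1)


def disagreements(a: Pair, b: Pair) -> int:
    """Hamming distance between two 2-bit branch words."""
    return (a[0] != b[0]) + (a[1] != b[1])


def sequential_decode(received: Sequence[Pair], turnaround: int = 3) -> Optional[List[int]]:
    """Depth-first search that backs up whenever the count reaches `turnaround`."""

    def search(level: int, state: Pair, count: int) -> Optional[List[int]]:
        if level == len(received):
            return []  # reached the end of the tree without hitting the criterion
        options = []
        for bit in (0, 1):  # bit 0 is the upper branch
            word, nxt = branch(state, bit)
            options.append((disagreements(word, received[level]), bit, nxt))
        # Stable sort: the better-matching branch first; ties keep the upper branch first.
        options.sort(key=lambda opt: opt[0])
        for dist, bit, nxt in options:
            if count + dist >= turnaround:
                continue  # turnaround criterion reached: abandon this branch
            tail = search(level + 1, nxt, count + dist)
            if tail is not None:
                return [bit] + tail
        return None  # all alternatives at this level exhausted: back up

    return search(0, (0, 0), 0)


if __name__ == "__main__":
    Z = [(1, 1), (0, 0), (0, 1), (1, 0), (0, 1)]
    print(sequential_decode(Z))
```

Under these assumptions the search abandons several partial paths before
settling on the message 1 1 0 1 1, and it holds only one path at a time, which
is the property that distinguishes sequential decoding from the Viterbi
algorithm.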


7.5.2 Comparisons and Limitations of Viterbi and Sequential Decoding


The major drawback of the Viterbi algorithm is that while error probability
decreases exponentially with constraint length, the number of code states, and
consequently decoder complexity, grows exponentially with constraint length. On
the other hand, the computational complexity of the Viterbi algorithm is
independent of channel characteristics (compared to hard-decision decoding,
soft-decision decoding requires only a trivial increase in the number of
computations). Sequential decoding achieves asymptotically the same error
probability as maximum likelihood decoding but without searching all possible
states. In fact, with sequential decoding the number of states searched is
essentially independent of constraint length, thus making it possible to use
very large (K = 41) constraint lengths. This is an important factor in
providing such low error probabilities. The major drawback of sequential
decoding is that the number of state metrics searched is a random variable. For
sequential decoding, the expected number of poor hypotheses and backward
searches is a function of the channel SNR. With a low SNR, more hypotheses must
be tried than with a high SNR. Because of this variability in computational
load, buffers must be provided to store the arriving sequences. Under low SNR,
the received sequences must be buffered while the decoder is laboring to find
a likely hypothesis. If the average symbol arrival rate exceeds the average
symbol decode rate, the buffer will overflow, no matter how large it is,
causing a loss of data. The sequential decoder typically puts out error-free
data until the buffer overflows, at which time the decoder has to go through a
recovery procedure. The buffer overflow threshold is a very sensitive function
of SNR. Therefore, an important part of a sequential decoder specification is
the probability of buffer overflow.

[Plot: bit error probability versus $E_b/N_0$ (dB). Curves: sequential, hard
decision (rates 1/3 and 1/2, K = 41); Viterbi, soft decision (rates 1/3 and
1/2, K = 7); Viterbi, hard decision (rates 1/2 and 1/3, K = 7); uncoded BPSK.]

Figure 7.24 Bit error performance for various Viterbi and sequential decoding
schemes using coherent BPSK over an AWGN channel. (Reprinted with permission
from J. K. Omura and B. K. Levitt, "Coded Error Probability Evaluation for
Antijam Communication Systems," IEEE Trans. Commun., vol. COM30, no. 5,
May 1982, Fig. 4, p. 900. © 1982 IEEE.)
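
As a rough illustration of the complexity trade-off described above, the sketch
below counts trellis states. It assumes the usual figure of $2^{K-1}$ states
for a rate 1/n binary convolutional code, which this section does not restate,
and the function name is hypothetical.

```python
def viterbi_state_count(constraint_length: int) -> int:
    """Number of trellis states a Viterbi decoder tracks for a rate 1/n binary code.

    Assumption: the encoder state is the previous K - 1 input bits,
    so there are 2**(K - 1) states.
    """
    if constraint_length < 1:
        raise ValueError("constraint length must be at least 1")
    return 2 ** (constraint_length - 1)


if __name__ == "__main__":
    for K in (3, 7, 41):
        print(f"K = {K:2d}: {viterbi_state_count(K):,} states")
```

Under this assumption K = 7 gives 64 states, while K = 41 would give $2^{40}$
(about $1.1 \times 10^{12}$) states, which is why constraint lengths that large
are paired with sequential rather than Viterbi decoding.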

In Figure 7.24, some typical $P_B$ versus $E_b/N_0$ curves for these two popular
solutions to the convolutional decoding problem, Viterbi decoding and sequential
decoding, illustrate their comparative performance using coherent BPSK over an
AWGN channel. The curves compare Viterbi decoding (rates 1/2 and 1/3, hard
decision, K = 7) versus Viterbi decoding (rates 1/2 and 1/3, soft decision,
K = 7) versus sequential decoding (rates 1/2 and 1/3, hard decision, K = 41).
One can see from Figure 7.24 that coding gains of approximately 8 dB at
$P_B = 10^{-6}$ can be achieved with sequential decoders. Since the work of
Shannon [26] foretold the potential of approximately 11 dB of coding gain
compared to uncoded BPSK, it appears that the major portion of what is
theoretically possible can already be accomplished.
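
The size of the potential coding gain can be checked with a short calculation.
The sketch below assumes the standard results for coherent BPSK on an AWGN
channel, $P_B = Q(\sqrt{2E_b/N_0})$, and for the Shannon limit at unlimited
bandwidth, $E_b/N_0 = \ln 2$ (about -1.59 dB). Neither formula is restated in
this section, and the function names are hypothetical.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bpsk_bit_error(ebn0_db: float) -> float:
    """Uncoded coherent BPSK bit error probability at the given Eb/N0 in dB."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return q_function(math.sqrt(2.0 * ebn0))


def required_ebn0_db(target: float, lo: float = 0.0, hi: float = 20.0) -> float:
    """Bisection search for the Eb/N0 (dB) at which uncoded BPSK reaches `target`."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if bpsk_bit_error(mid) > target:
            lo = mid  # still too many errors: more Eb/N0 is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Shannon limit for unlimited bandwidth, expressed in dB.
SHANNON_LIMIT_DB = 10.0 * math.log10(math.log(2.0))

if __name__ == "__main__":
    for target in (1e-5, 1e-6):
        need = required_ebn0_db(target)
        print(f"P_B = {target:g}: uncoded BPSK needs {need:.2f} dB; "
              f"gap to the Shannon limit is {need - SHANNON_LIMIT_DB:.2f} dB")
```

The exact gap depends on the reference error probability chosen, which is why
the potential gain is quoted only approximately.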

7.5.3 Feedback Decoding 


A feedback decoder makes a hard decision on the data bit at stage $j$ based on
metrics computed from stages $j, j + 1, \ldots, j + m$, where $m$ is a
preselected positive integer. Look-ahead length, $L$, is defined as
$L = m + 1$, the number of received code symbols, expressed in terms of the
corresponding number of encoder input bits, that are used to decode an
information bit. The decision of whether the data bit is zero or one depends on
which branch the minimum Hamming distance path traverses in the look-ahead
window from stage $j$ to stage $j + m$. The detailed operation is best
understood in terms of a specific example. Let us consider the use of a
feedback decoder for the rate 1/2 convolutional code shown in Figure 7.3.
Figure 7.25 illustrates the tree diagram and the operation of the feedback
decoder for $L = 3$. That is, in decoding the bit at branch $j$, the decoder
considers the paths at branches $j$, $j + 1$, and $j + 2$.

Beginning with the first branch, the decoder computes $2^L$ or eight cumulative
Hamming path metrics and decides that the bit for the first branch is zero if
the minimum distance path is contained in the upper part of the tree, and
decides one if the minimum distance path is in the lower part of the tree.
Assume that the received sequence is Z = 1 1 0 0 0 1 0 0 0 1. We now examine
the eight paths from time $t_1$ through time $t_3$ in the block marked A in
Figure 7.25, and compute metrics comparing these eight paths with the first six
received code symbols (three branches deep times two symbols per branch).
Listing the Hamming cumulative path metrics (starting from the top path), we
see that they are

Upper-half metrics: 3, 3, 6, 4

Lower-half metrics: 2, 2, 1, 3

We see that the minimum metric is contained in the lower part of the tree.
Therefore, the first decoded bit is one (characterized by a downward movement
on the tree). The next step is to extend the lower part of the tree (the part
that survived) one stage deeper, and again compute eight metrics, this time
from $t_2$ through $t_4$. Having decoded the first two code symbols, we now
slide over two code symbols to the right and again compute the path metrics for
six code symbols. This takes place in the block marked B in Figure 7.25. Again,
listing the metrics from top path to bottom path, we find that they are

Upper-half metrics: 2, 4, 3, 3

Lower-half metrics: 3, 1, 4, 4

For the assumed received sequence, the minimum metric is found in the lower half
of block B. Therefore, the second decoded bit is one.

Figure 7.25 Feedback decoding example.

The same procedure continues until the entire message is decoded. The
decoder is called a feedback decoder because the detection decisions are fed
back to the decoder in determining the subset of code paths that are to be
considered next. On the BSC, the feedback decoder can perform nearly as well as
the Viterbi decoder [17] in that it can correct all the more probable error
patterns, namely all those of weight $(d_f - 1)/2$ or less, where $d_f$ is the
free distance of the code. An important design parameter for feedback
convolutional decoders is $L$, the look-ahead length. Increasing $L$ increases
the coding gain but also increases the decoder implementation complexity.
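
The look-ahead procedure of this section can be sketched in a few lines. The
sketch assumes the K = 3, rate 1/2 encoder of Figure 7.3 (connection vectors
111 and 101) and the received sequence Z = 11 00 01 00 01 used in the example.
Two choices are assumptions of the sketch rather than statements from the text:
ties are resolved in favor of the upper half of the tree, and the window is
simply truncated when fewer than L branches remain. The function names are
hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]


def branch(state: Pair, bit: int) -> Tuple[Pair, Pair]:
    """Return (branch word, next state); state = (most recent bit, bit before it)."""
    s1, s2 = state
    return (bit ^ s1 ^ s2, bit ^ s2), (bit, s1)


def window_metrics(state: Pair, window: Sequence[Pair]) -> List[int]:
    """Cumulative Hamming metric of every path through the window, top path first."""
    metrics = []
    for bits in product((0, 1), repeat=len(window)):  # all-zero (top) path comes first
        s, total = state, 0
        for bit, rx in zip(bits, window):
            word, s = branch(s, bit)
            total += (word[0] != rx[0]) + (word[1] != rx[1])
        metrics.append(total)
    return metrics


def feedback_decode(received: Sequence[Pair], look_ahead: int = 3) -> List[int]:
    """Decide one bit per stage from the minimum-metric half of the look-ahead window."""
    state, decoded = (0, 0), []
    for j in range(len(received)):
        metrics = window_metrics(state, received[j:j + look_ahead])
        half = len(metrics) // 2
        # Upper half corresponds to bit 0, lower half to bit 1; a tie selects the upper half.
        bit = 0 if min(metrics[:half]) <= min(metrics[half:]) else 1
        decoded.append(bit)
        _, state = branch(state, bit)  # feed the decision back: keep only the surviving subtree
    return decoded


if __name__ == "__main__":
    Z = [(1, 1), (0, 0), (0, 1), (0, 0), (0, 1)]
    print(window_metrics((0, 0), Z[0:3]))  # block A
    print(window_metrics((1, 0), Z[1:4]))  # block B
    print(feedback_decode(Z))
```

Under these assumptions the two windows reproduce the block A and block B
metrics listed above, and the decisions follow the lower half of the tree for
the first two bits.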


7.6 CONCLUSION 

In the last decade, coding emphasis has been in the area of convolutional codes
since in almost every application, convolutional codes outperform block codes
for the same implementation complexity of the encoder-decoder. For satellite
communication channels, forward error correction techniques can easily reduce
the required SNR for a specified error performance by 5 to 6 dB. This coding
gain can translate directly into an equivalent reduction in required satellite
effective isotropic radiated power (EIRP), with consequently reduced satellite
weight and cost.

In this chapter we have outlined the essential structural difference between
block codes and convolutional codes: rate 1/n convolutional codes have a memory
of the prior K - 1 bits, where K is the encoder constraint length. With such
memory, the encoding of each input data bit depends not only on the value of
that bit but also on the values of the K - 1 input bits that precede it. We
presented the decoding problem in the context of the maximum likelihood
algorithm, examining all the candidate codeword sequences that could possibly
be created by the encoder, and selecting the one that appears statistically
most likely; the decision is based on a distance metric for the received code
symbols. The error performance analysis of convolutional codes is more
complicated than the simple binomial expansion describing the error performance
of many block codes. We laid out the concept of free distance, and we presented
the relationship between free distance and error performance in terms of
bounds. We also described the basic idea behind sequential decoding and
feedback decoding and showed some comparative performance curves and tables for
various coding schemes.


PROBLEMS

7.1. Draw the state diagram, tree diagram, and trellis diagram for the K = 3,
rate 1/3 code generated by

$g_1(X) = X + X^2$
$g_2(X) = 1 + X$
$g_3(X) = 1 + X + X^2$

7.2. Given a K = 3, rate 1/2 binary convolutional code with the partially
completed state diagram shown in Figure P7.1, find the complete state diagram
and sketch a diagram for the encoder.

7.3. Draw the state diagram, tree diagram, and trellis diagram for the
convolutional encoder characterized by the block diagram in Figure P7.2.

7.4. Suppose that you were trying to find the quickest way to get from London to
Vienna by boat or train. The diagram in Figure P7.3 was constructed from
various schedules. The labels on each path are travel times. Using the Viterbi
algorithm, find the fastest route from London to Vienna. In a general sense,
explain how the algorithm works, what calculations must be made, and what
information must be retained in the memory used by the algorithm.

Figure P7.1

7.5. Consider the convolutional encoder shown in Figure P7.4. 

(a) Write the connection vectors and polynomials for this encoder. 

(b) Draw the state diagram, tree diagram, and trellis diagram.

7.6. What is the impulse response of the encoder of Problem 7.5? Using the
impulse response, determine the output sequence when the input is 1 0 1. Verify
by using the generator polynomials.

7.7. Does the encoder of Problem 7.5 exhibit the properties of catastrophic error 
propagation? Justify your answer with an example. 

7.8. Find the free distance of the encoder of Problem 7.3 by the transfer
function method.

7.9. Let the codewords of a coding scheme be

a = 0 0 0 0 0 0
b = 1 0 1 0 1 0
c = 0 1 0 1 0 1
d = 1 1 1 1 1 1

If the received sequence over a binary symmetric channel is 1 1 1 0 1 0 and a
maximum likelihood decoder is used, what will be the decoded symbol?

7.10. Consider that the rate 1/2 encoder of Figure 7.3 is used over a binary
symmetric channel (BSC). Assume that the initial encoder state is the 00 state.
At the output of the BSC, the sequence Z = (1 1 0 0 0 0 1 0 1 1 rest all "0")
is received.

(a) Find the maximum likelihood path through the trellis diagram, and determine
the first 5 decoded information bits. If a tie occurs between any two merged
paths, choose the upper branch entering the particular state.

(b) Identify any channel bits in Z that were inverted by the channel during
transmission.

7.11. Determine which of the following rate 1/2 codes are catastrophic.

(a) $g_1(X) = X^2$, $g_2(X) = 1 + X + X^3$

(b) $g_1(X) = 1 + X^2$, $g_2(X) = 1 + X^3$

(c) $g_1(X) = 1 + X + X^2$, $g_2(X) = 1 + X + X^3 + X^4$

(d) $g_1(X) = 1 + X + X^3 + X^4$, $g_2(X) = 1 + X^2 + X^4$

(e) $g_1(X) = 1 + X^4 + X^6 + X^7$, $g_2(X) = 1 + X^3 + X^4$

(f) $g_1(X) = 1 + X^3 + X^4$, $g_2(X) = 1 + X + X^2 + X^4$

7.12. (a) Consider a coherently detected BPSK signal encoded with the encoder
shown in Figure 7.3. Find an upper bound on the bit error probability, $P_B$,
if the available $E_b/N_0$ is 6 dB. Assume hard-decision decoding.

(b) Compare $P_B$ with the uncoded case and calculate the improvement factor.

7.13. Using sequential decoding, illustrate the path along the tree diagram
shown in Figure 7.22 when the received sequence is 01 11 00 01 11. The backup
criterion is three disagreements.

7.14. Repeat the decoding example of Problem 7.13 using feedback decoding, with
a look-ahead length of 3. In the event of a tie, select the upper half of the
tree.


Figure P7.5

7.15. Figure P7.5 depicts a constraint length 2 convolutional encoder.

(a) Draw the state diagram, tree diagram, and trellis diagram.

(b) Assume that a received message from this encoder is 1 1 0 0 1 0. Use a
feedback decoding algorithm with a look-ahead length of 2 to decode the coded
message sequence.

7.16. Using the branch word information on the encoder trellis of Figure 7.7,
decode the sequence Z = (01 11 00 01 11 rest all "0"), using hard-decision
Viterbi decoding.

7.17. Consider the rate 2/3 convolutional encoder shown in Figure P7.6. In this
encoder, k = 2 bits at a time are shifted into the encoder and n = 3 bits are
generated at the encoder output. There are kK = 4 stages in the register, and
the constraint length is K = 2 in units of 2-bit bytes. The state of the
encoder is defined as the contents of the rightmost K - 1 k-tuple stages. Draw
the state diagram, the tree diagram, and the trellis diagram.

7.18. Find the ratio of the predetection signal-to-noise spectral density, in
decibels, required to yield a decoded data rate of 1 Mbit/s with error
probability of $10^{-5}$. Assume binary noncoherent FSK modulation. Also, assume
convolutional encoding with the decoder relationship

$P_B = 2000\,p_c^4$

where $p_c$ and $P_B$ are bit error probabilities into and out of the decoder,
respectively.

7.19. Using Table 7.4, devise a K = 4, rate 1/2 binary convolutional encoder.

(a) Draw the circuit.

(b) Draw the encoding trellis showing its states and branch words.

(c) Configure the cells that would be implemented in an ACS algorithm.

7.20. For the K = 3, rate 1/2 code described by the encoder circuit of
Figure 7.3, perform soft-decision decoding for the following demodulated
sequence. The signals are 8-level quantized integers in the range of 0 to 7.
The level 0 represents the perfect binary 0, and the level 7 represents the
perfect binary 1. If the digits into the decoder are 6, 7, 5, 3, 1, 0, 1, 1, 2,
where the leftmost digit is the earliest, use a decoding trellis diagram to
decode the first three data bits. Assume that the encoder had started in the
00 state, and that the decoding process is perfectly synchronized.


QUESTIONS 


7.1. In convolutional encoding, why is flushing of the register periodically
performed? (See Sections 7.2.1 and 7.3.4.)

7.2. Define what is meant by the state of a machine. (See Section 7.2.2.)

7.3. What is a finite-state machine? (See Section 7.2.2.)

7.4. What are soft decisions, and how much greater complexity is there in the
process of soft-decision Viterbi decoding as compared with hard-decision
decoding? (See Sections 7.3.2 and 7.4.8.)

7.5. What is another (descriptive) name for a binary symmetric channel (BSC)?
(See Section 7.3.2.1.)

7.6. Describe the Add-Compare-Select (ACS) computations performed in the process
of Viterbi decoding. (See Section 7.3.5.)

7.7. On a trellis diagram, an error is associated with a surviving path that
diverges from, and then remerges to, the correct path. Why is it necessary for
the path to remerge? (See Section 7.4.1.)


EXERCISES 

Using the Companion CD, run the exercises associated with Chapter 7. 





